System for monitoring a service provider partner

ABSTRACT

A system is disclosed for monitoring a service provider partner. A historical dataset corresponding to a historical behavior of a set of service provider partners may be identified. The historical dataset may be processed to identify a feature vector relating to detecting a fraudulent service provider partner. A classifier model may be generated from the historical dataset and the feature vector. Current service provider partner data representing a current service provider partner may be collected. The current service provider partner data may be processed to generate a current service provider partner feature vector. A score representing the likelihood that the service provider partner is fraudulent may be generated by applying the classifier model to the current service provider partner data feature vector. A monitor may be identified and notified of the score of the current service provider partner. The monitor may perform some action based on the score of the current service provider partner.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation of U.S. patent application Ser. No. 11/644,586, filed Dec. 21, 2006 (pending), which is incorporated by reference herein.

TECHNICAL FIELD

The present description relates generally to a system for monitoring a service provider partner, and more particularly, but not exclusively, for monitoring a service provider partner to determine if they may be a profitable service provider partner or an unprofitable service provider partner.

BACKGROUND

Content providers may derive a portion of their revenues from online advertising. The revenue may be generated by revenue generators, such as advertisers, who may pay content providers to have their advertisements displayed to users. The advertiser may maintain a funded account with the content provider through an automated system such as YAHOO!'S SEARCH MARKETING system. The system may deduct funds from the account each time the advertiser incurs a charge, such as when an advertisement of the advertiser is displayed to a user or when a user clicks on the advertiser's advertisement. When the advertiser's account is depleted of all its funds, the system may require the advertiser to replenish their account with additional funds before displaying any further advertisements.

The automated nature of a system, such as YAHOO! SEARCH MARKETING, may simplify the process of displaying ads for an advertiser; however, it may also provide fraudulent advertisers with non-traditional venues for defrauding content providers and/or individual users. A fraudulent advertiser may be able to use such an automated system to direct users to scam web sites, such as a “phishing” site, where users are routinely defrauded of their credit card information and/or personal information. Since the user may have been directed to the fraudulent web site by the advertisements displayed by the content provider, the user may associate the content provider with the fraudulent web site, thereby diminishing the general good will of the content provider.

Fraudulent advertisers may also use non-traditional methods to defraud the content providers' systems. The fraudulent advertisers may be able to abuse the systems to rapidly accumulate charges in excess of the amount of funds in their account. If the fraudulent advertiser exceeds a daily budget limit that the content provider agreed to abide by or if the fraudulent advertiser funded their account with a prepaid debit or credit card, the content provider may have no opportunity to receive payment of the excessive charges.

Some current advertising platforms, such as YAHOO! SEARCH MARKETING, may rely on traditional credit verification systems for identifying fraudulent advertisers. A traditional credit verification system may rely on traditional methods in order to identify fraudulent advertisers, such as matching an advertiser's address with the address registered to the credit card used by the advertiser. These traditional credit verification systems may be unable to identify high risk advertisers who are likely to commit the non-traditional types of advertiser fraud mentioned above.

SUMMARY

A system for monitoring a service provider partner may include: a memory, an interface and a processor. The memory may be able to store a classifier model, one or more feature vectors, data relating to historical service provider partners, and data relating to a current service provider partner. The interface may be operatively connected to the memory and may collect data relating to the current service provider partner and communicate with a monitor. The processor may be operatively connected to the memory and the interface. The processor may process the historical service provider partner data to identify a feature vector relating to detecting a fraudulent service provider partner, may generate a classifier model from the historical dataset and the feature vector, may process the collected current service provider partner data to generate a feature vector associated with the current service provider partner, may generate a score relating to the service provider partner's behavior by applying the classifier model to the feature vector relating to the current service provider partner and may communicate the score to the monitor. The monitor may perform some action based on the score of the current service provider partner.

Other systems, methods, features and advantages will be, or will become, apparent to one with skill in the art upon examination of the following figures and detailed description. It is intended that all such additional systems, methods, features and advantages be included within this description, be within the scope of the embodiments, and be protected by the following claims and be defined by the following claims. Further aspects and advantages are discussed below in conjunction with the description.

BRIEF DESCRIPTION OF THE DRAWINGS

Non-limiting and non-exhaustive descriptions are provided with reference to the following drawings. In the drawings, like reference numerals may refer to like parts throughout the various figures unless otherwise specified.

FIG. 1 is a block diagram of a general overview of a system for scoring revenue generators.

FIG. 2 is a block diagram of a simplified view of a network environment implementing a system for scoring revenue generators.

FIG. 3 is a block diagram of a view of a system for scoring revenue generators.

FIG. 4 is a flowchart illustrating basic operations of the systems of FIG. 1, FIG. 2, and FIG. 3 or other systems for scoring revenue generators.

FIG. 5 is a flowchart illustrating in more detail the operations of the systems of FIG. 1, FIG. 2, and FIG. 3 or other systems for scoring revenue generators.

FIG. 6 is a flowchart illustrating steps that may be taken by a revenue generator in the systems of FIG. 1, FIG. 2, and FIG. 3 or other systems for scoring revenue generators.

FIG. 7 is a flowchart illustrating steps that may be taken by a monitor in a system for scoring revenue generators.

FIG. 8 is a screenshot of an implementation of a monitor interface in the systems of FIG. 1, FIG. 2, and FIG. 3 or other systems for scoring revenue generators.

FIG. 9 is a screenshot of an implementation of a monitor interface in the systems of FIG. 1, FIG. 2, and FIG. 3 or other systems for scoring revenue generators.

FIG. 10 is a screenshot of the output of a clustering algorithm run on a historical dataset.

FIG. 11 is a screenshot of a daily absolute advertiser search term risk graph containing daily absolute advertiser search term risk data that may be displayed to a monitor.

FIG. 12 is a screenshot of a daily relative advertiser search term risk graph containing daily relative advertiser search term risk data that may be displayed to a monitor.

FIG. 13 is a screenshot of a daily spend amount for an advertiser graph containing daily spend amount data for an advertiser that may be displayed to a monitor.

FIG. 14 is a screenshot of an hourly spend amount for an advertiser graph containing hourly spend amount data for an advertiser that may be displayed to a monitor.

FIG. 15 is a screenshot of a monthly spend amount for an advertiser graph containing monthly spend amount data for an advertiser that may be displayed to a monitor.

FIG. 16 is a model of an implementation of a system for scoring revenue generators.

FIG. 17 is a class diagram of an implementation of a system for scoring revenue generators.

FIG. 18 is a use case of an implementation of a system for scoring revenue generators.

FIG. 19 is a graphic demonstrating the machine learning process that may be used in a system for scoring revenue generators.

DETAILED DESCRIPTION

A system, and method, generally referred to as a system, relate to detecting fraudulent revenue generators, such as advertisers, and more particularly, but not exclusively, to detecting advertisers who may be likely to commit fraud in an online advertising system.

The principles described herein may be embodied in many different forms. The system may allow an entity to identify revenue generators who may be likely to commit fraud. The system may allow an entity to prevent revenue generator fraud from occurring by identifying and handling revenue generators who may be likely to commit fraud. The system may allow an entity to identify revenue generators who may be likely to generate significant revenue and those who may be likely to generate insignificant revenue. The system may allow an entity to maximize revenue by efficiently identifying and handling revenue generators who are likely to generate significant revenue and those likely to generate insignificant revenue.

FIG. 1 provides a general overview of a system 100 for scoring revenue generators. The system 100 may include one or more revenue generators 110A-N, such as advertisers, a service provider 130, one or more monitors 120A-N and users 150, which may represent one or more users, such as web surfers. The revenue generators 110A-N may pay the service provider 130 to display ads, such as on-line ads on a network such as the Internet. The users 150 may interact with the ads of the revenue generators 110A-N displayed by the service provider 130, such as by clicking on an ad. The revenue generators 110A-N may pay the service provider 130 when the users 150 interact with the ads of the revenue generators 110A-N. The payments may be based on various factors, such as the number of ads displayed and/or the number of times a user clicks through to the web site of a revenue generator A 110A.

In the system 100, the revenue generators 110A-N may interact with the service provider 130, such as via a web application. The revenue generators 110A-N may send information, such as billing, website and advertisement information, to the service provider 130 via the web application. The web application may include a web browser or other application such as any application capable of displaying web content. The application may be implemented with a processor such as a personal computer, personal digital assistant, mobile phone, or any other machine capable of implementing a web application. The monitors 120A-N may also interact individually with the service provider 130, such as via a web application. The monitors 120A-N may include administrators of the system; for example, an administrator of the system may also perform the functions of the monitors 120A-N. The monitors 120A-N may interact with the service provider 130 via a web based application or a standalone application. The service provider 130 may communicate data to the revenue generators 110A-N and the monitors 120A-N over a network. The following examples may refer to a revenue generator A 110A as an online advertiser; however, the system 100 may apply to any revenue generators who interact with a service provider 130, such as service provider partners.

In operation, one of the revenue generators 110A-N, such as revenue generator A 110A, may provide information to the service provider 130. This information may relate to the transaction taking place between the revenue generator A 110A and the service provider 130, or may relate to an account the revenue generator A 110A maintains with the service provider 130. In the case of a revenue generator A 110A who is an online advertiser, the revenue generator A 110A may provide initial information necessary to open an account with the service provider 130. The service provider 130 may use this initial information to make a determination about whether the revenue generator A 110A may be a fraudulent revenue generator. The service provider 130 may provide a venue for the revenue generator A 110A to display their advertisements. Throughout the account life cycle, the service provider 130 may continue to collect information regarding the behavior of the revenue generator A 110A. This information may be used to further refine the determination of whether the revenue generator A 110A may be a fraudulent revenue generator.

A revenue generator A 110A who is an online advertiser may maintain several accounts with the service provider 130. For each account the revenue generator A 110A may maintain several listings. A listing may include a search term, a URL, a bid amount and a rank. The search term may represent a term that the revenue generator A 110A wishes to associate their advertisement with in a search engine environment. The URL may represent the link the revenue generator A 110A wishes the users 150, such as web surfers, to be directed to upon clicking on the advertisement of the revenue generator A 110A, such as the home page of the revenue generator A 110A. The bid amount may represent a maximum amount the revenue generator A 110A may be willing to spend when the users 150 may click on their advertisement or when their advertisement may be shown to the users 150. The rank may be automatically populated by the system 100 and may represent where the bid of the revenue generator A 110A ranks compared to the bids of other revenue generators 110B-N for the same search term.
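
As an illustration only, a listing and its automatically populated rank might be sketched as follows. The `Listing` class, its field names, and the `populate_ranks` helper are hypothetical names introduced here; the sketch also assumes that rank 1 goes to the highest bid for a search term and that tied bids keep their input order, neither of which is specified above.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    """One listing in an advertiser account (field names are illustrative)."""
    advertiser_id: str
    search_term: str
    url: str
    bid_amount: float
    rank: int = 0  # populated by the system, not supplied by the advertiser


def populate_ranks(listings):
    """Rank each listing against the other bids on the same search term.

    Assumption: rank 1 is the highest bid; ties keep their input order.
    """
    by_term = {}
    for listing in listings:
        by_term.setdefault(listing.search_term, []).append(listing)
    for group in by_term.values():
        # sorted() is stable, so equal bids stay in input order.
        ordered = sorted(group, key=lambda item: item.bid_amount, reverse=True)
        for position, listing in enumerate(ordered, start=1):
            listing.rank = position
    return listings
```

Each search term is ranked independently, so a low bid on an uncontested term may still receive rank 1.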

The revenue generator A 110A may also supply a daily budget for each account, which may represent the maximum amount of charges the service provider 130 may charge to each account on a given day. If this budget value is reached on a given day for a given account, or other period of time identified for the budget, the service provider 130 may stop displaying the advertisements for the given account of the revenue generator A 110A until the next day or until the expiration of some other defined period of time. The daily budget amount, the bid amounts, the search terms bid on and the URL the revenue generator A 110A directs users 150 to may all be collected by the service provider 130 and used to determine if the revenue generator A 110A is likely to be a fraudulent revenue generator. The service provider 130 may continue to collect information on the revenue generator A 110A, such as the average amount of payments made, the total sum of charges accrued in a month, and other values that may be useful in classifying the revenue generator A 110A as a fraudulent or not fraudulent revenue generator.
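
The daily budget behavior described above might be sketched as follows. The `Account` class and its method names are hypothetical, amounts are held in integer cents to avoid floating-point rounding, and the sketch assumes a charge is accepted whenever the budget has not yet been reached, so the final charge of a day may overshoot the budget.

```python
class Account:
    """Minimal sketch of per-account daily budget tracking (hypothetical API)."""

    def __init__(self, daily_budget_cents):
        self.daily_budget_cents = daily_budget_cents
        self.spent_today_cents = 0

    def can_display(self):
        # Ads keep showing until the budget value is reached.
        return self.spent_today_cents < self.daily_budget_cents

    def charge(self, amount_cents):
        """Record a charge; return False once ads are paused for the day."""
        if not self.can_display():
            return False
        self.spent_today_cents += amount_cents
        return True

    def start_new_period(self):
        # Called at the next day, or at the end of another defined period.
        self.spent_today_cents = 0
```

Under this assumption, the amount by which the final charge exceeds the budget is the kind of overage that a spend-over-daily-budget feature could capture.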

The service provider 130 may use the data collected from the revenue generator A 110A to determine a score, such as regarding a likelihood of the revenue generator A 110A being a fraudulent revenue generator. For the sake of explanation, the description is described in terms of determining a score regarding the likelihood that the revenue generator A 110A is fraudulent, but the score may be used in other ways, such as to determine revenue generators 110A-N that are not fraudulent. The described system 100 may also be used with entities other than the revenue generators 110A-N. The service provider 130 may take some action based on the score of the revenue generator A 110A, such as taking all of the accounts of the revenue generator A 110A offline, resulting in a freeze in the service provided by the service provider 130 to the revenue generator A 110A. The service provider 130 may output the score of the revenue generator A 110A to one of the monitors 120A-N, such as monitor A 120A. Alternatively or in addition, the service provider 130 may flag the revenue generator A 110A as requiring attention and may notify one of the monitors 120A-N, such as the monitor A 120A, that the revenue generator A 110A requires attention.

Alternatively or in addition, if the score indicates the revenue generator A 110A may be fraudulent, the service provider 130 may automatically set a risk status value of the account of the revenue generator A 110A to “Unacceptable Offline”, the URL associated with the account may be added to a ban list, and the service provider 130 may flag the account for review by one of the monitors 120A-N. Alternatively or in addition, the URL domain or contact information associated with the account, such as name, address, phone, and email address, may be added to a ban list. If a URL, URL domain or contact information is added to a ban list, it may be disassociated from all accounts and may not be associated with accounts in the future. If an account is flagged for monitor review, a monitor A 120A may log into the system 100 and review the account and may determine whether the account was properly scored and whether the account received the proper risk status value. The service provider 130 may also flag an account to be audited by one of the monitors 120A-N.
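
A minimal sketch of these automatic actions is shown below. The function name, the dictionary keys, and the `threshold` parameter are hypothetical, and the sketch assumes that a higher score indicates a higher likelihood of fraud; the description does not fix a score direction or a cutoff.

```python
from urllib.parse import urlparse

# Risk status text taken from the description above.
OFFLINE_STATUS = "Unacceptable Offline"


def handle_score(account, score, threshold, banned_urls, banned_domains, review_queue):
    """Apply the automatic actions for an account whose score suggests fraud.

    Assumption: scores at or above `threshold` indicate likely fraud.
    Returns True if the account was taken offline and flagged.
    """
    if score < threshold:
        return False
    account["risk_status"] = OFFLINE_STATUS
    banned_urls.add(account["url"])
    domain = urlparse(account["url"]).hostname
    if domain:
        banned_domains.add(domain)
    # Flag the account so a monitor can confirm or reverse the decision.
    review_queue.append(account["id"])
    return True
```

Queuing the account for review keeps a monitor in the loop, so an account that was scored improperly can be put back online.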

Furthermore, a specific monitor A 120A may be associated with specific revenue generators 110A-N. Alternatively or in addition, multiple monitors 120A-N may be associated with a specific revenue generator A 110A or all the monitors 120A-N may be associated with all revenue generators 110A-N. Any time the service provider 130 communicates with, or notifies, a monitor A 120A, the service provider 130 may notify all monitors 120A-N, or may only notify the monitors 120A-N associated with the revenue generator A 110A.

The monitor A 120A may review the score of the revenue generator A 110A and the data of the revenue generator A 110A. The monitor A 120A may take some action based on the score, such as putting the accounts of the revenue generator A 110A online if the monitor A 120A determines that the revenue generator A 110A was improperly identified as a fraudulent revenue generator. Alternatively or in addition, the monitor A 120A may place a ban on the URL used by the revenue generator A 110A from the system 100 if the monitor A 120A determines that the revenue generator A 110A was properly identified as a fraudulent revenue generator. If the monitor A 120A determines that the revenue generator A 110A was improperly scored, the monitor A 120A may notify the service provider 130 that the revenue generator A 110A was improperly scored. Alternatively or in addition, the monitor A 120A may manually modify a security status value associated with the revenue generator A 110A.

FIG. 2 provides a simplified view of a network environment implementing a system 200 for determining a score for revenue generators, such as a score relating to a likelihood that a revenue generator 110A-N is fraudulent. The system 200 may include one or more web applications, standalone applications and mobile applications 210A-N, which may be collectively or individually referred to as client applications for the revenue generators 110A-N. The system 200 may also include one or more web applications, standalone applications and mobile applications 220A-N, which may collectively be referred to as client applications for the monitors 120A-N, or individually as a monitor client application. The system 200 may also include one or more applications 250, such as web applications, standalone applications and mobile applications, which collectively may be referred to as applications for the users 150, or individually as a user application. The system 200 may also include a network 230, and the service provider servers 240.

The revenue generators 110A-N may use a web application 210A, standalone application 210B, or a mobile application 210N, or any combination thereof, to communicate to the service provider servers 240, such as via the network 230. The network 230 may include wide area networks (WAN), such as the internet, local area networks (LAN), campus area networks, metropolitan area networks, or any other networks that may allow for data communication. Similarly, the monitors 120A-N may use either a web application 220A, a standalone application 220B, or a mobile application 220N to communicate to the service provider servers 240, via the network 230. The service provider servers 240 may communicate to the revenue generators 110A-N via the network 230, through the web applications, standalone applications or mobile applications 210A-N. The service provider servers 240 may also communicate to the monitors 120A-N via the network 230, through the web applications, standalone applications or mobile applications 220A-N.

The users 150 may use one or more applications 250 to communicate to the service provider servers 240. The applications 250 may include web applications, standalone applications, or mobile applications. The service provider servers 240 may communicate to the users 150 via the network 230, through the applications 250.

The web applications, standalone applications and mobile applications 210A-N, 220A-N may be connected to the network 230 in any configuration that supports data transfer. This may include a data connection to the network 230 that may be wired or wireless. Any of the web applications, standalone applications and mobile applications 210A-N, 220A-N may individually be referred to as a client application. The web applications 210A, 220A may run on any platform that supports web content, such as a web browser or a computer, a mobile phone, or any appliance capable of data communications.

The standalone applications 210B, 220B may run on a machine that may have a processor, memory, a display, and an interface. The processor may be operatively connected to the memory, display and the interface and may perform tasks at the request of the standalone applications 210B, 220B or the underlying operating system. The memory may be capable of storing data. The display may be operatively connected to the memory and the processor and may be capable of displaying information to the revenue generator B 110B or the monitor B 120B. The interface may be operatively connected to the memory, the processor, and the display. The standalone applications 210B, 220B may be programmed in any programming language that supports communication protocols. These languages may include: SUN JAVA, C++, C#, ASP, SUN JAVASCRIPT, asynchronous SUN JAVASCRIPT, or ADOBE FLASH ACTIONSCRIPT, amongst others. The standalone applications 210B, 220B may be third party standalone applications or may be third party servers.

The mobile applications 210N, 220N may run on any mobile device that may have a data connection. The data connection may be a cellular connection, a wireless data connection, an internet connection, an infra-red connection, a Bluetooth connection, or any other connection capable of transmitting data. The aforementioned descriptions of the web applications, standalone applications and mobile applications may also apply to the applications 250.

The service provider servers 240 may include one or more of the following: an application server, a data source, such as a database server, and a middleware server. The service provider servers 240 may co-exist on one machine or may be running in a distributed configuration on one or more machines. The service provider servers 240 may collectively be referred to as the server.

There may be several configurations of database servers, application servers and middleware servers that may support such a system 200. Database servers may include MICROSOFT SQL SERVER, ORACLE, IBM DB2 or any other database software, relational or otherwise. The application server may be APACHE TOMCAT, MICROSOFT IIS, ADOBE COLDFUSION, YAPACHE or any other application server that supports communication protocols. The middleware server may be any middleware that connects software components or applications.

FIG. 3 provides a view of a system 300 that may be used for scoring revenue generators 100, such as the revenue generators 110A-N. The system 300 may include a historical data source 320, a server 310 for classifying revenue generators 110A-N, a revenue generator data source 380, a monitor A 120A and a revenue generator A 110A. The server 310 may include a classifier model generator 330, a classifier model 340, a scoring metric 350, and a revenue generator processing component 360. The server 310, the historical data source 320 and the revenue generator data source 380 may be components of the service provider servers 240. Furthermore, the components of the server 310, namely the classifier model generator 330, the classifier model 340, the scoring metric 350 and the revenue generator processing component 360, may reside on one server or may be distributed across several servers.

The server 310 may run on a machine that may have a processor, memory, a display, and an interface. The processor may be operatively connected to the memory, display and the interface and may perform tasks at the request of the classifier model generator 330, the classifier model 340, the scoring metric 350, the revenue generator processing component 360, or the underlying operating system. The memory may be capable of storing data. The display may be operatively connected to the memory and the processor and may be capable of displaying information to the revenue generator A 110A or the monitor A 120A. The interface may be operatively connected to the memory, the processor, and the display and may be capable of communicating to or interacting with the revenue generator A 110A and the monitor A 120A.

In operation, the historical data source 320 may supply the classifier model generator 330 with historical revenue generator data. The historical revenue generator data may be processed by the classifier model generator 330 to create one or more features, or characteristics, which may be able to describe the behavior of revenue generators 110A-N who have been affirmatively identified as either fraudulent or not fraudulent. The features may be combined into a feature vector. The features or the feature vector may be inputs to the machine learning algorithm.

In the case of a revenue generator A 110A who is an online advertiser, the historical revenue generator data may include any of the aforementioned data values describing online advertisers, such as a bid amount, a search term bid on, a daily budget value, the URL the advertisement directs users 150 to, the change history of the account, the spend history of the account, the spend to replenish ratio of the account, which may represent the amount the revenue generator A 110A spends in relation to the amount the revenue generator A 110A uses to replenish their account, the average amount of payment, the number of times the credit card associated with the account is charged in a month, the total sum of charges accrued in a month, the average account adjustment amount, the credit rating of the account owner, the total number of adjustments, a client run rate representing the amount an advertiser spends on all of their accounts per day and any other data collected that may assist in generating a classifier model 340. Other revenue generators 110A-N, such as service provider partners, may utilize different historical revenue generator datasets.

The historical data may be processed by the classifier model generator 330 to create several features that may describe the behavior of a revenue generator A 110A and may be used as inputs to a machine learning algorithm used to generate the classifier model 340. There may be several features that may be used to describe a revenue generator A 110A. The features may include: a client tier value, a security status value, a risk status value, a client age value represented by the number of days the revenue generator A 110A has maintained an account with the service provider 130, a client search term max-spend-score, a client search term risk score, a client search term risk-spend score, a client risk-run rate score, a client max spend over daily budget score, a client run-rate rate of change score, a client run rate over daily budget score and a client card score.

The features that may be combined into the feature vector may be identified by using clustering, such as K-Means clustering, and other data mining techniques on the historical data or may be identified by manual analysis of the features. The feature selection may be performed on unlabeled data. Forward feature selection or backward feature removal may be used in identifying the features. Forward feature selection may include starting with zero features, adding one feature in each iteration, testing or calculating the information gain of the new set of features in each iteration and selecting the best new feature in each iteration (a local maximum). Backward feature removal may include starting with all features, removing one feature in each iteration and stopping when the removal of any feature may reduce the information gain by a certain percentage.
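
Forward feature selection as described above might be sketched as a greedy loop. The `score_fn` callable, which returns the information gain (or any other quality measure) of a candidate feature subset, is a hypothetical parameter supplied by the caller, and the `min_gain` stopping rule is an assumption, since the description does not state when forward selection ends.

```python
def forward_select(all_features, score_fn, min_gain=0.0):
    """Greedy forward feature selection.

    Starts with zero features and, in each iteration, adds the single
    feature whose addition gives the best score. Stops when the best
    candidate improves the score by no more than `min_gain` (assumed rule).
    """
    selected = []
    remaining = list(all_features)
    best_score = score_fn(selected)
    while remaining:
        # Score every one-feature extension of the current subset.
        candidates = [(score_fn(selected + [f]), f) for f in remaining]
        new_score, best_feature = max(candidates, key=lambda pair: pair[0])
        if new_score - best_score <= min_gain:
            break
        selected.append(best_feature)
        remaining.remove(best_feature)
        best_score = new_score
    return selected
```

Because each iteration keeps only the locally best feature, the result is a local maximum and may differ from the best subset found by exhaustive search.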

A value for each of the features may be calculated periodically, randomly and/or for determined time intervals, such as for the following periods of time: hourly, daily, monthly, or any other time period the monitor A 120A may deem useful in determining whether the revenue generator A 110A is a fraudulent revenue generator. The system 100 may also calculate an exponentially weighted moving average of the set of the last M time period scores, where M may be a number of days (up to and including the previous day) specified by one of the monitors 120A-N. M may represent a number of days or a number of months. Any of the time period values of the features or the exponentially weighted moving average of the features may be used as an input to the machine learning algorithm.

The exponentially weighted moving average may apply weighting factors which decrease exponentially. The weighting for each period of time may decrease exponentially, giving much more importance to recent observations while still not discarding older observations entirely. The degree of weighting decrease may be expressed as a constant smoothing factor α, a number between 0 and 1. α may be specified by a monitor A 120A and may be expressed as a percentage, so a smoothing factor of 10% may be equivalent to α=0.1. The exponentially weighted moving average may also be referred to as an exponential moving average, or an EMA. An EMA calculation may be a standard statistical calculation and more information may be available at: http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc431.htm.
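
One common form of the EMA is the recurrence S_t = α·x_t + (1 − α)·S_(t−1). The sketch below uses that form and seeds the average with the oldest score; both the seeding choice and the function name `ema` are assumptions made for illustration, and other initialization conventions exist.

```python
def ema(scores, alpha):
    """Exponential moving average of `scores`, ordered oldest to newest.

    Assumption: the average is seeded with the oldest score. A larger
    alpha gives more weight to recent observations.
    """
    if not scores:
        raise ValueError("at least one score is required")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    average = scores[0]
    for score in scores[1:]:
        average = alpha * score + (1 - alpha) * average
    return average
```

With α = 0.5 and scores of 10, 20 and 30, the running average moves from 10 to 15 to 22.5, so older scores still contribute but with a weight that halves at each step.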

For each of the aforementioned features and time periods, a number of scores of the features for the given time period may be calculated. The scores may include an absolute score of the feature for the time period, a relative score of the feature for the time period, an absolute rate of change score of the feature for the time period, and a relative rate of change score of the feature for the time period. Any of these scores may be used as inputs to the machine learning algorithm.

The absolute score of each feature may represent the actual calculated score of each feature for the given time period. The relative score may represent a percentile of an absolute score of a feature associated with the revenue generator A 110A, in relation to all other absolute scores for the given feature for the other revenue generators 110B-N. A score of a feature referenced without an identifier, such as “absolute” or “relative,” may refer to the absolute score of the feature.

The relative score of each feature may be calculated by multiplying the absolute score of the feature by a percentile. The percentile may represent how the absolute score of the feature of a revenue generator A 110A compares with the absolute scores of all the other revenue generators 110B-N. A percentile calculation may be a standard statistical calculation and more information may be found at: http://www.itl.nist.gov/div898/handbook/prc/section2/prc252.htm.
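
The percentile itself might be computed as in the sketch below, which only illustrates the comparison against the other revenue generators. The function name `percentile_rank` is hypothetical, and the sketch assumes the "at or below" definition of a percentile rank; other percentile definitions would give slightly different values.

```python
def percentile_rank(score, other_scores):
    """Percentage of the other generators' absolute scores at or below `score`.

    Assumption: an "at or below" percentile rank on a 0-100 scale.
    """
    if not other_scores:
        raise ValueError("other_scores must not be empty")
    at_or_below = sum(1 for other in other_scores if other <= score)
    return 100.0 * at_or_below / len(other_scores)
```

For example, an absolute score of 7 compared with other scores of 1, 5, 7 and 9 sits at the 75th percentile under this definition.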

The rate of change score of each feature may indicate the rate at which the value of the feature changes from period to period. The rate of change score may be a positive value or a negative value. To calculate the rate of change score of the feature, the monitor A 120A may need to indicate a number of previous time periods, M, over which to calculate the rate of change score, where M is greater than 1. The EMA for the feature may then be calculated over the M periods. The rate of change of the feature may be calculated by subtracting the EMA value of the absolute score of the feature for the last M time periods from the absolute score of the feature and then dividing the result by the number of periods, M. The rate of change calculations may include calculating derivatives and determining a polynomial equation that best fits the scores of the feature for the time period identified.
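The rate of change calculation may be sketched as shown below. The function name is hypothetical. The sketch assumes that the M previous periods exclude the current period and that the EMA is seeded with the oldest value in the window; neither detail is fixed by the description.

```python
def rate_of_change(previous_scores, current_score, alpha, m):
    """(current score - EMA of the last m previous scores) / m."""
    if m <= 1:
        raise ValueError("M must be greater than 1")
    window = previous_scores[-m:]
    if len(window) < m:
        raise ValueError("not enough previous periods")
    # EMA over the window, oldest value first (seeding is an assumption).
    ema = float(window[0])
    for value in window[1:]:
        ema = alpha * value + (1 - alpha) * ema
    # A positive result means the feature is rising; negative, falling.
    return (current_score - ema) / m
```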

The client tier value may describe the value a revenue generator A 110A may bring to the service provider 130, as identified by a monitor A 120A. The client tier value may also be automatically set by the system 100 based on the amount of revenue the revenue generator A 110A may have generated for the service provider 130. The client tier value may be associated with a text identifier describing the value. The following may be examples of text identifiers associated with client tier values: “Unknown,” “New,” “Standard,” “Premier,” “Gold,” “24K Gold,” “Platinum,” “Diamond,” “Old,” and “Super Diamond.” The text identifier may be used to assist the monitors 120A-N in interpreting the client tier value.

The security status value may be a nominal value that may relate to the likelihood that a revenue generator A 110A may interact properly or improperly with the service provider 130. The security status value may be identified by a monitor A 120A and may typically only be set by the service provider 130 when a revenue generator A 110A initially signs up, such as when a revenue generator A 110A initially signs up as an advertiser. The security status value may be associated with a text identifier describing the security status value. The following may be examples of text identifiers of the security status value: “Offline Fraudulent,” “Online Verified,” “Offline Unverified,” “Online Unverified,” “Online High Risk,” and “Offline High Risk.” Other text identifiers may also be used to describe the security status value of the revenue generators 110A-N.

The risk status value may be a nominal value that may relate to the likelihood that a revenue generator A 110A may interact properly or improperly with the service provider 130. The risk status value may be determined by the service provider 130 based on the transaction history of a revenue generator A 110A. Typically the risk status value may be set automatically by the service provider 130. The risk status value may be associated with a text identifier describing the status. The following may be examples of text identifiers for risk status: “Acceptable Online” and “Unacceptable Offline.” Other text identifiers may also be used to describe the risk status of the revenue generators 110A-N.

The client age value may represent the number of days since a revenue generator A 110A initially interacted with the service provider 130, such as when the revenue generator A 110A initially signed up to become an advertiser. The client age value may be associated with an account of the revenue generator A 110A.

The client search term max-spend-score may represent the aggregation of all the individual search term max-spend-scores associated with each of the accounts associated with a revenue generator A 110A. The account search term max-spend-score may be calculated by aggregating the individual search term max-spend-scores for each search term bid on in a given account.

An individual search term max-spend-score may indicate the maximum amount a revenue generator A 110A may be willing to spend for a search term. This may indicate the spending intentions of the revenue generator A 110A for the search term and not the actual amount spent by the revenue generator A 110A. To calculate the search term max-spend-score for each search term bid on, the service provider 130 may first need to determine a rank-percent value.

The rank-percent value may be calculated by using a curve to approximate the percentage of clicks one of the revenue generators 110A-N, such as the revenue generator A 110A, may obtain for a given listing based on the listing's rank in the search results. This curve may have the percentage of clicks on the y-axis and the rank on the x-axis. All of the values of the y-axis may add up to 1.0. There may be a curve used for mature markets, where many bids may be submitted for a given search term, and there may be other curves for other stages of market maturity. The individual search term max-spend-score may be calculated by multiplying the rank-percent by the bid amount of the revenue generator A 110A for the search term and by the search term's average daily click volume, representing the number of times users 150 may click on an advertisement in a given day after searching for the search term.
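A minimal sketch of the max-spend-score calculation follows. The curve values, the function names, and the treatment of ranks that fall off the curve are all hypothetical and shown for illustration only. Aggregation at the account level is assumed here to be a sum.

```python
# Hypothetical mature-market curve: rank -> share of clicks (shares sum to 1.0).
MATURE_MARKET_CURVE = {1: 0.5, 2: 0.3, 3: 0.2}


def search_term_max_spend(rank, bid, avg_daily_clicks, curve=MATURE_MARKET_CURVE):
    """rank-percent * bid amount * average daily click volume."""
    # Assumption: a rank not on the curve receives no clicks.
    rank_percent = curve.get(rank, 0.0)
    return rank_percent * bid * avg_daily_clicks


def account_max_spend(listings, curve=MATURE_MARKET_CURVE):
    """Sum the individual scores for (rank, bid, avg_daily_clicks) listings."""
    return sum(
        search_term_max_spend(rank, bid, clicks, curve)
        for rank, bid, clicks in listings
    )
```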

A max spend change event may represent an event that makes the search term max-spend-score eligible to be recalculated, such as a bid change, a search term addition, or a search term deletion. The number of max spend change events may represent the total number of max spend change events in a given time period or may represent the average number of max spend change events over the course of a given time period. The number of max spend change events may be a feature.

The client search term risk score may represent the aggregation of all of the search term risk scores associated with each of the accounts of a revenue generator A 110A. The individual search term risk score for a search term may vary according to whether the search term has been associated with any known fraudulent accounts. The value of an individual search term risk score may be 1 if no risk is associated with the search term. If the search term has been associated with a known fraudulent account, the value of the search term risk score may be 1000 multiplied by the number of times the search term has been associated with a known fraudulent account. Alternatively or in addition, other multipliers may be used, such as 10 or 100. A search term with no risk associated with it, and therefore a search term risk score of 1, may be referred to as a non-risk search term, and a search term with risk associated with it may be referred to as a risk search term.
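The individual search term risk score may be sketched as follows. The function and parameter names are hypothetical.

```python
def search_term_risk_score(fraud_association_count, multiplier=1000):
    """1 for a non-risk search term, else multiplier * number of associations."""
    if fraud_association_count < 0:
        raise ValueError("count cannot be negative")
    if fraud_association_count == 0:
        return 1  # non-risk search term
    return multiplier * fraud_association_count
```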

In order to calculate the account search term risk score, several initial variables may need to be calculated, which may be defined as: x, y, a, and b. X may be calculated by subtracting the number of search terms in the account that may have been associated with a fraudulent account from the number of risk search terms in the account. Y may be calculated by subtracting the number of search terms in the account that have not been associated with a fraudulent account from the number of non-risk search terms in the account. A may be calculated by aggregating both the search term risk score of all the risk search terms in the account and the search term risk score of all of the search terms in the account that may have been associated with a fraudulent account. B may be a configurable value, such as 1.2. The account search term risk score may then be calculated by first raising a to the b power, then adding y to the result, and lastly dividing the sum by the sum of x plus y. This calculation may be mathematically represented as (a**b+y)/(x+y), where the symbol “**” may represent an exponential operator. The total number of search terms in the account may be equal to x plus y.
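The final formula may be sketched as below. The function name is hypothetical, and x, y, and a are taken as already computed inputs. Raising an error when the account has no search terms is an assumption.

```python
def account_search_term_risk_score(x, y, a, b=1.2):
    """(a ** b + y) / (x + y), where x + y is the number of search terms."""
    if x + y == 0:
        raise ValueError("the account has no search terms")
    return (a ** b + y) / (x + y)
```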

The client search term risk-spend score may represent the aggregate of all the account search term risk-spend scores associated with a revenue generator A 110A. The score may be intended to measure the combined risk of the search terms of the revenue generator A 110A with the amount of money the revenue generator A 110A may be willing to spend. A revenue generator A 110A who may spend a high dollar amount for high-risk search terms may get a higher score than a revenue generator B 110B who may spend a high dollar amount for low-risk search terms. In order to calculate the account search term risk-spend score for a given account, the classifier model generator 330 may first need to calculate the account search term max-spend-score and a relative account search term risk score. The account search term max-spend-score may be calculated according to the method elaborated above and the relative account search term risk score may be calculated per the method for calculating relative scores elaborated above.

Once the account search term max-spend-score and the relative account search term risk scores have been calculated, the classifier model generator 330 may calculate the account search term risk-spend score. A base and exponent calculation may be used in calculating the account search term risk-spend score. The base value may be equal to the account search term risk score and the exponent may be equal to a configurable value, ‘A’, such as 1.2.

The account search term risk-spend score may be calculated by raising the base to the power of the exponent and multiplying the result by the account search term max-spend-score. This calculation may be mathematically represented as: account search term max-spend-score * (account search term risk score ** A), where the symbol “**” represents an exponential operator.
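For illustration, the risk-spend formula may be expressed as follows. The function and parameter names are hypothetical.

```python
def account_risk_spend_score(max_spend_score, risk_score, exponent=1.2):
    """account max-spend-score * (account search term risk score ** A)."""
    return max_spend_score * (risk_score ** exponent)
```

Because the risk score is raised to a power greater than 1, an account with high-risk search terms may score disproportionately higher than an account with the same max-spend-score and low-risk search terms.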

The client search term risk-run rate score may be determined by finding the maximum account search term risk-run rate score for any account associated with a revenue generator A 110A. In order to calculate the account search term risk-run rate score, the account run rate score and the account search term risk score may need to be calculated. The account run rate may be the amount of money a revenue generator A 110A may spend on an account for a given time period, such as a day. In the case of a day time period, the account run rate may be calculated once per day.

Once the account run rate and the account search term risk score have been calculated, the account search term risk-run rate score may be calculated. A base and exponent may be used in calculating the account search term risk-run rate score. The base may represent the account search term risk score and the exponent may represent a configurable value, ‘A’, such as 1.2. The account search term risk-run rate score may be calculated by raising the base to the power of the exponent and then multiplying the result by the account run rate. This calculation may be mathematically represented as: account run rate * (account search term risk score ** A), where the symbol “**” represents an exponential operator. A high value for the account search term risk-run rate score may indicate that a client may be willing to spend a relatively high amount of money for a relatively low number of clicks.
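The account-level formula and the client-level maximum may be sketched together as shown below. The function names and the (run rate, risk score) pair representation of an account are hypothetical.

```python
def account_risk_run_rate_score(run_rate, risk_score, exponent=1.2):
    """account run rate * (account search term risk score ** A)."""
    return run_rate * (risk_score ** exponent)


def client_risk_run_rate_score(accounts, exponent=1.2):
    """Maximum account-level score over (run_rate, risk_score) pairs."""
    if not accounts:
        raise ValueError("the client has no accounts")
    return max(
        account_risk_run_rate_score(run_rate, risk, exponent)
        for run_rate, risk in accounts
    )
```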

The client max spend over daily budget score for a revenue generator A 110A may be calculated by finding the maximum value of the account max spend over daily budget score for any account associated with the revenue generator A 110A. The account max spend over daily budget score may be calculated by using the account search term max-spend-score calculated for the previous day, and the daily time weighted average of the daily budget for the previous day. The daily budget may represent the maximum amount a revenue generator A 110A may be willing to spend for a given account on a given day. A time weighted average of the daily budget may need to be calculated if the budget amount changes over the course of a day. The account max spend over daily budget score may be calculated by dividing the account max-spend-score by the daily budget. The account max-spend-score may indicate what the revenue generator A 110A may be willing to pay for a given account if there were no budget. When the account max spend over daily budget score is very high it may mean that the revenue generator A 110A may be willing to receive very few clicks relative to their budget.
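A minimal sketch of the time weighted average and the resulting score follows. The function names are hypothetical, and representing the day's budget changes as (budget, hours in effect) pairs is an assumption made for illustration.

```python
def time_weighted_daily_budget(segments):
    """Average of (budget, hours_in_effect) pairs, weighted by hours."""
    total_hours = sum(hours for _, hours in segments)
    if total_hours <= 0:
        raise ValueError("segments must cover a positive amount of time")
    return sum(budget * hours for budget, hours in segments) / total_hours


def max_spend_over_daily_budget(account_max_spend_score, segments):
    """Account max-spend-score divided by the time weighted daily budget."""
    return account_max_spend_score / time_weighted_daily_budget(segments)
```

For example, a budget of 100 for the first twelve hours and 200 for the remaining twelve may yield a time weighted daily budget of 150.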

The client run-rate rate of change score may be equal to the maximum value of the account run-rate rate of change score for any account associated with the revenue generator A 110A. Calculating the individual account run-rate rate of change scores may include calculating the exponentially weighted moving average for the daily account run rate for the previous M days, where M is greater than 1. The daily account run rate may be the amount the revenue generator A 110A may spend on the account for a given day, calculated once per day. The account run-rate rate of change scores may then be calculated by subtracting the EMA value of the run rate score from the account run rate value. The client run-rate rate of change calculations may include calculating derivatives and determining a polynomial equation that best fits the client run-rate scores for the given time period.

The client run rate over daily budget score for a revenue generator A 110A may be determined by finding the maximum account run rate over daily budget score for any account associated with the revenue generator A 110A. The account run rate over daily budget score may be calculated by dividing the account run rate by the time weighted average daily budget for the account. For example, the account run rate over daily budget score calculation may use the account run rate calculated for the previous day, and the time weighted average of the daily budget of the account for the previous day. The daily budget may represent the maximum amount a revenue generator A 110A may be willing to spend for a given account on a given day.

The client card score may represent the worst credit rating value of any credit card associated with any of the accounts of a revenue generator A 110A. The account card score may represent the worst credit rating value of any credit card associated with a particular account. The credit rating value may be an AFS score, which may be a CYBERSOURCE ADVANCED FRAUD SCREEN credit card transaction score. The value of an AFS score may range between 1 and 99, where 99 may represent a credit card transaction most likely to be fraudulent, and 1 may represent a credit card transaction least likely to be fraudulent, or vice-versa. Other ranges may also be used. The credit rating value may also be supplied by other credit rating metrics, or any other credit card processor. In the case of an AFS score, the client card score may represent the highest card AFS score for any account associated with the revenue generator A 110A.

The historical data may include a data field that classifies the revenue generators 110A-N as fraudulent or not fraudulent. This determination may have been made by one of the monitors 120A-N based on the historical behavior of the revenue generators 110A-N. There may be other values that may be used to identify the revenue generator A 110A but that do not assist in classifying the revenue generator A 110A as fraudulent, such as the account id or account name of the revenue generator A 110A. Furthermore, there may be other features that may assist in classifying a revenue generator A 110A who is an online advertiser. These features may be identified by the service provider 130 or by one of the monitors 120A-N, and may include an account age, an age factor, and a spend to replenish ratio.

The replenish rate may represent the rate the account replenishes its funds, which may be represented by the sum of payments per month divided by the number of payments per month. The run rate may represent the rate at which the account spends its funds. The spend to replenish ratio may represent the amount the revenue generator A 110A spends in relation to the amount the revenue generator A 110A uses to replenish their account.
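These two values may be sketched as follows. The function names are hypothetical. The description does not give a formula for the spend to replenish ratio, so dividing the amount spent by the amount replenished is an assumption, as is returning 0.0 for a month with no payments.

```python
def replenish_rate(payments_in_month):
    """Sum of payments in the month divided by the number of payments."""
    if not payments_in_month:
        return 0.0  # assumption: no payments means a replenish rate of zero
    return sum(payments_in_month) / len(payments_in_month)


def spend_to_replenish_ratio(amount_spent, amount_replenished):
    """Assumed form: amount spent divided by amount replenished."""
    if amount_replenished == 0:
        raise ValueError("no funds were replenished")
    return amount_spent / amount_replenished
```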

The classifier model generator 330 may combine the features identified in the historical data or any other features into a feature vector to be submitted as inputs to a machine learning algorithm to generate the classifier model 340. The monitor A 120A or the service provider 130 may identify a machine learning algorithm to be used in generating the classifier model 340, such as a C4.5 algorithm, and may identify which data fields of the revenue generator data may be used in generating the classifier model 340. Other machine learning algorithms may include any decision trees, such as ID3 or C4.5 decision trees, artificial neural networks, pattern recognition with K-nearest neighbor classifiers, maximum margin classifiers such as a support vector machine, or probability based classifiers, such as a Bayes classifier or a naïve Bayes classifier.

The revenue generator processing component 360 may interact with the revenue generator A 110A, may collect data relevant to the revenue generator A 110A, and may store the collected data in the revenue generator data source 380. The revenue generator processing component 360 may process the collected data relevant to the revenue generator A 110A to create the previously mentioned feature vector to input into the classifier model. The revenue generator processing component 360 may then submit the feature vector or other input data associated with the revenue generator A 110A into the classifier model 340.

The revenue generator processing component 360 may process the data collected from the revenue generator A 110A and input the processed data to the classifier model 340 each time new data relevant to the revenue generator A 110A is collected or at predetermined intervals of time. Alternatively or in addition, another server in the service provider servers 240 may collect data relevant to the revenue generators 110A-N and store the data in a data source. In this case the revenue generator processing component 360 may retrieve data relating to the revenue generators 110A-N directly from the data source.

The classifier model 340 may submit the results of the classification to the scoring metric 350. The results of the classification may include a list of the classes available for classification, such as “fraudulent” or “not fraudulent,” and a weight associated with each class. The weight may indicate the likelihood that the data belongs to the class the weight is associated with. The weights may be between 0 and 1 and the aggregate of all the weights may equal 1.

The scoring metric 350 may apply a scoring metric to the classifier results to generate a composite score. The scoring metric 350 may be a metric that converts the classifier results into the composite score. The significance of the composite score may be easily understood by a monitor A 120A. The metric may be a multiplier, such as 1000, that may be applied to a weight of one of the classes, such as the “not fraudulent” class. For example, if the weight associated with the class “not fraudulent” was 0.8, the scoring metric may convert the weight into a score of 800. The scoring metric 350 may use other data associated with the revenue generators 110A-N in converting the classifier results into a score. The scoring metric 350 may obtain the data from the classifier model 340, the revenue generator processing component 360 or directly from the revenue generator data source 380.
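The multiplier form of the scoring metric may be sketched as below. The function name and the dictionary representation of the classifier results are hypothetical. The check that the weights sum to 1 is an added safeguard rather than a requirement of the description.

```python
def composite_score(class_weights, scored_class="not fraudulent", multiplier=1000):
    """Convert the weight of one class into a composite score."""
    # Safeguard (assumption): classifier weights are expected to sum to 1.
    if abs(sum(class_weights.values()) - 1.0) > 1e-9:
        raise ValueError("class weights must sum to 1")
    return multiplier * class_weights[scored_class]
```

A weight of 0.8 for the “not fraudulent” class may thus yield a composite score of 800.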

For example, the scoring metric 350 may take an average of the classifier results and any combination of the other scores mentioned above. There may be other formulas used to convert the classifier results to a score which may be identified by a monitor A 120A or predetermined by the service provider 130. Any of these formulas may be used to generate the composite score which may be communicated to the revenue generator processing component 360.

The revenue generator processing component 360 may take some action based on the composite score of the revenue generator A 110A. For example, if the composite score is below a certain threshold, the revenue generator processing component 360 may set the risk status value of the revenue generator A 110A to “Unacceptable Offline” and may notify a monitor A 120A that the revenue generator A 110A requires attention. There may be other actions that the revenue generator processing component 360 may automatically perform based on the score of the revenue generator A 110A. After taking any such actions, the revenue generator processing component 360 may communicate the scored classifier results and any other scores associated with the revenue generator A 110A to the monitor A 120A. In some instances the revenue generator processing component 360 may not communicate the scores associated with the revenue generator A 110A to a monitor A 120A.
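The threshold action may be sketched as follows. The function name, the dictionary representation of a revenue generator, and the threshold of 500 are hypothetical values chosen for illustration.

```python
def act_on_composite_score(revenue_generator, score, threshold=500):
    """Take the generator offline below the threshold.

    Returns True when a monitor should be notified.
    """
    if score < threshold:
        revenue_generator["risk_status"] = "Unacceptable Offline"
        return True
    return False
```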

The monitor A 120A may review the composite score and any other scores associated with the revenue generator A 110A. The monitor A 120A may handle the revenue generator A 110A by updating the security status value of the revenue generator A 110A based on the composite score and other scores of the revenue generator A 110A, such as by changing the security status value of the revenue generator A 110A to “Offline Fraudulent,” or by changing a spend limit of the revenue generator. The spend limit may represent the maximum amount the revenue generator A 110A may spend in a given time period, such as a day. There may be one spend limit for the revenue generator A 110A, or there may be a separate spend limit for each account of the revenue generator A 110A.

The monitor A 120A may determine whether the composite score associated with the revenue generator A 110A accurately reflects the revenue generator A 110A's historical behavior. If the monitor A 120A determines that the composite score associated with the revenue generator A 110A does not accurately reflect the behavior of the revenue generator A 110A, the monitor A 120A may add the data associated with the revenue generator A 110A to the historical data source 320, and classify the data as either fraudulent or not fraudulent. The classification of the data may be based on the monitor A 120A's expert opinion regarding the behavior of the revenue generator A 110A. The step of adding the data associated with the revenue generator A 110A to the historical data source 320 may also be performed by the server 310. In this instance the monitor A 120A may only communicate to the server 310 that the revenue generator A 110A may have been improperly classified. The server 310 may execute the remaining steps.

The classifier model generator 330 may then reprocess the historical data from the historical data source 320 and may re-input the processed data into the identified machine learning algorithm. After a new classifier model 340 has been generated, the revenue generator processing component 360 may re-input the data associated with the revenue generator A 110A to the classifier model 340. The new composite score displayed to the monitor A 120A may properly reflect the behavior of the revenue generator A 110A. If the composite score does not accurately reflect the behavior of the revenue generator A 110A, the monitor A 120A may attempt to correct the composite score by modifying the feature vector inputted to the machine learning algorithm, selecting a different machine learning algorithm, or adjusting the scoring metric 350.

FIG. 4 illustrates basic operations that may be used with the systems of FIG. 1, FIG. 2, FIG. 3 or other systems for scoring revenue generators. At block 410 the classifier model generator 330 may generate a classifier model 340 by inputting processed historical revenue generator data to a machine learning algorithm. The features inputted to the machine learning algorithm may be any of the features enumerated above. The machine learning algorithm may be any of the types of machine learning algorithms enumerated above or any other machine learning algorithm capable of classifying data.

At block 420 the server 310 may store the classifier model 340. At block 430 the server 310 may obtain current revenue generator data from interactions with one of the revenue generators 110A-N, such as revenue generator A 110A. At block 440 the revenue generator processing component 360 may process the data of the revenue generator A 110A to create the inputs to the classifier model 340. The server 310 may then apply the classifier model 340 to the processed data of revenue generator A 110A. At block 450, the server 310 may apply the scoring metric 350 to the results of the classification of the data associated with the revenue generator A 110A. The scoring metric 350 may be in any of the previously enumerated forms, such as a multiplier of 1000 to create a composite score. At block 460 the composite score and any other scores associated with the revenue generator A 110A may be communicated to one of the monitors 120A-N, such as the monitor A 120A. The monitor A 120A may review the scores and perform an action on the revenue generator A 110A, such as changing the security status value of the revenue generator A 110A.

FIG. 5 further illustrates operations that may be used with the systems of FIG. 1, FIG. 2, FIG. 3 or other systems for scoring revenue generators. At block 510 the server 310 may identify a historical dataset containing revenue generator data that may have been classified by an expert, such as the monitor A 120A. The expert may have classified the revenue generator data based on the historical behavior of the revenue generators. This historical data may be mined from historical revenue generator transaction databases or may be supplied by a third party.

At block 520, the classifier model generator 330 may process the dataset to generate data capable of describing the behavior of the revenue generators 110A-N and accurately classifying the revenue generators 110A-N. The process may further include determining which inputs form clusters by using a clustering algorithm such as the K-Means algorithm. Any combination of the aforementioned scores may be capable of acting as inputs to the machine learning algorithm.

At block 525 the service provider 130 or one of the monitors 120A-N may determine a machine learning algorithm best structured for using the identified inputs to create a classifier model 340 for classifying the revenue generators 110A-N. Each of the machine learning algorithms enumerated above may be capable of classifying the revenue generators 110A-N.

The process of identifying the inputs in block 520 and the process of selecting a machine learning algorithm in block 525 may be coupled together. The process of identifying the proper inputs may be a process of recursively cycling inputs through a machine learning algorithm to determine which inputs maximize the information gain. The information gain may represent how accurately the inputs can classify historical revenue generator data. Both processes may be performed by one of the monitors 120A-N, or any individual capable of determining the proper inputs.
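The description does not give a formula for the information gain. As one possibility, assumed here for illustration, the entropy-based information gain used by ID3-style decision trees may be computed for a single categorical input as follows. The function names are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on the feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    # Weighted entropy of the labels remaining within each feature value.
    remainder = sum(
        len(group) / len(labels) * entropy(group) for group in groups.values()
    )
    return entropy(labels) - remainder
```

An input whose values separate the “fraudulent” and “not fraudulent” records perfectly may yield the largest gain, while an input unrelated to the classification may yield a gain near zero.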

Once the inputs capable of classifying revenue generators 110A-N and a machine learning algorithm best structured for handling the inputs have been identified, the system 100 may move to block 530. At block 530 the classifier model generator 330 may input the processed historical data into the machine learning algorithm to generate a classifier model 340. At block 535 the server 310 may store the classifier model 340.

At block 540 the service provider 130 may collect data associated with a revenue generator, such as the revenue generator A 110A. This data may relate to the aforementioned scores or any other data that may correlate to the behavior of the revenue generator A 110A. The data may be collected by the revenue generator processing component 360, or through another set of servers. If the data is collected through a remote set of servers, the server 310 may mine the data directly from the remote servers.

Once the server 310 has collected new data on a revenue generator, such as the revenue generator A 110A, the system 100 may move to block 560. At block 560 the revenue generator processing component 360 may process the collected data to generate the proper inputs and then the classifier model 340 may classify the processed data. Alternatively or in addition, the revenue generator processing component 360 may process and classify the data at specified time intervals. At block 565 the server 310 may apply a scoring metric 350 to the results of the classifier model 340 to develop a composite score. The scoring metric 350 may be based on any of the aforementioned calculations. Furthermore, the scoring metric 350 may associate a range of weights of a given classification with an identifier that may be an alphanumeric character, a symbol, an image, or any other representation that may be useful in converting the classifier results into a format easily understood by the monitors 120A-N or other potential users of the system 100. For instance, the scoring metric 350 may associate weights of the classification “not fraudulent” between 0.8 and 1.0 with 5 stars, weights between 0.6 and 0.8 with 4 stars, and weights between 0.0 and 0.2 with 1 star.
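The star mapping may be sketched as shown below. The function name is hypothetical. The 3 star and 2 star ranges, and the assignment of a boundary weight such as 0.8 to the higher band, are assumptions that fill in the pattern suggested by the example ranges.

```python
def stars_for_weight(not_fraudulent_weight):
    """Map the weight of the "not fraudulent" class to a 1 to 5 star rating."""
    if not 0.0 <= not_fraudulent_weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    # (lower bound, stars); the 0.4 and 0.2 bands are assumed for illustration.
    bands = [(0.8, 5), (0.6, 4), (0.4, 3), (0.2, 2)]
    for lower_bound, stars in bands:
        if not_fraudulent_weight >= lower_bound:
            return stars
    return 1
```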

Once the server 310 has applied the scoring metric 350 to the classifier model results, the system 100 may move to block 570. At block 570 the server 310 may communicate the composite score of the revenue generator A 110A to a monitor A 120A. In some instances the server 310 may not communicate all composite scores to a monitor A 120A. The monitor A 120A may review the composite score and the other associated scores to determine whether the composite score accurately reflects the behavior of the revenue generator A 110A. The monitor A 120A may also handle the revenue generator A 110A based on the composite score of the revenue generator A 110A. For instance the monitor A 120A may handle the revenue generator A 110A by modifying the security status value of the revenue generator A 110A based on the composite score, such as by changing the security status value to “Offline Fraudulent” if the composite score indicates that the revenue generator A 110A may be likely to commit fraud.

If the monitor A 120A determines that the composite score accurately reflects the behavior of the revenue generator A 110A, then the system 100 may return to block 540 and continue to collect data on the revenue generator A 110A. If the monitor A 120A determines that the composite score does not accurately reflect the behavior of the revenue generator A 110A, the system 100 may move to block 575. At block 575, the monitor A 120A may notify the server 310 that the revenue generator A 110A was scored incorrectly. The monitor A 120A may make this determination by analyzing any currently available scores and data relating to the revenue generator A 110A. Other users may also be able to notify the server 310 or the service provider 130 of an improperly scored revenue generator.

The server 310 may be automatically notified of an improperly scored revenue generator A 110A any time one of the monitors 120A-N or other users changes the security status value of one of the revenue generators 110A-N, such as the revenue generator A 110A.

At block 580 the server 310 may add the data of the revenue generator A 110A to the historical data source 320, along with the correct classification of the data. At block 585 the classifier model generator 330 may generate a new classifier model 340 with the updated historical data. At block 590 the new classifier model 340 may be stored by the server 310. After storing the new classifier model 340 the system 100 may move to block 560 where the revenue generator processing component 360 may reprocess and re-input the data related to the revenue generator A 110A to the classifier model 340. If the monitor A 120A notifies the server 310 that the revenue generator A 110A was again scored incorrectly, the server 310 may need to adjust the inputs to the machine learning algorithm, may need to adjust the scoring metric 350, or may need to select a new machine learning algorithm as elaborated above.

FIG. 6 illustrates a process that may be taken by one of the revenue generators 110A-N in the systems of FIG. 1, FIG. 2, FIG. 3 or other systems for scoring revenue generators. At block 605 a revenue generator A 110A may interact with the system 100, such as by logging into the system 100. At block 610 the revenue generator A 110A may provide initial data to the service provider 130, such as when the revenue generator A 110A initially signs up for the service. The initial data may include a name, address, credit card number, initial listings, or any other data that may be required by the service provider 130. At block 615, the classifier model 340 may generate an initial classification for the revenue generator A 110A based on the initial data. Alternatively or in addition, the system 100 may not attempt to classify the data associated with the revenue generator A 110A until a certain period of time after the initial sign on of the revenue generator A 110A, or until the service provider 130 has collected a certain amount of data associated with the revenue generator A 110A.

At block 620 the server 310 may apply the scoring metric 350 to the results from the classifier model 340 to generate a composite score for the revenue generator A 110A. At block 625 the server 310 may modify the risk status value or security status value of the revenue generator A 110A based on the composite score of the revenue generator A 110A. The monitor A 120A may set ranges of composite scores that may correspond to the security status values of the revenue generators 110A-N. The monitor A 120A may select an option to have the server 310 automatically change the security status value of the revenue generator A 110A based on the composite score of the revenue generator A 110A. The monitor A 120A may be able to select this option for an individual revenue generator A 110A or across all revenue generators 110A-N. The system 100 may notify the monitor A 120A any time the security status value or risk status value of one of the revenue generators 110A-N is modified.
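By way of illustration only, the mapping from monitor-defined composite score ranges to security status values might be sketched as follows. The numeric bounds, the choice of status for each range, and the names `LOWER_BOUNDS`, `STATUSES`, and `status_for_score` are hypothetical; the description does not specify them. The sketch assumes that a lower composite score indicates higher risk.

```python
from bisect import bisect_right

# Hypothetical range boundaries a monitor might configure. Each value is the
# inclusive lower edge of the next (less risky) range.
LOWER_BOUNDS = [0.3, 0.7]

# One status per range, ordered from the lowest scores to the highest.
STATUSES = ["Offline High Risk", "Online High Risk", "Online Unverified"]


def status_for_score(composite_score: float) -> str:
    """Return the security status whose score range contains the score."""
    return STATUSES[bisect_right(LOWER_BOUNDS, composite_score)]
```

With these assumed bounds, a score of 0.1 falls in the lowest range, while a score of exactly 0.7 falls in the highest range.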

There are several statuses that may be associated with a revenue generator A 110A, such as security status, which may be set by a monitor, and risk status, which may be set by the server 310. The security status may have values such as “Offline Fraudulent,” “Online Unverified,” “Offline Unverified,” “Online Verified,” “Online High Risk,” or “Offline High Risk.” The risk status may have values of “Acceptable Online” and “Unacceptable Offline.” The server 310 may assign a new revenue generator A 110A a security status of “Online Unverified” and a risk status of “Unacceptable Offline” by default. If the revenue generator A 110A is assigned one of the “Offline” statuses, then all of the accounts of the revenue generator A 110A may be taken offline or may remain offline. If the revenue generator A 110A is assigned one of the “Online” statuses, then all of the accounts of the revenue generator A 110A may be placed online or may remain online.

At block 630 the server 310 determines whether the score of the revenue generator A 110A drops below a monitor review threshold or whether the status of the revenue generator A 110A was automatically modified by the server 310. The monitor review threshold may be set by the server 310 for all monitors 120A-N or may be set by each individual monitor 120A-N. The monitors 120A-N may select a separate monitor review threshold for each individual revenue generator A 110A they may be associated with, or the monitors 120A-N may set one monitor review threshold for all of the revenue generators 110A-N they may be associated with.

If the composite score of the revenue generator A 110A does not drop below the monitor review threshold and if the server 310 did not modify any statuses of the revenue generator A 110A, then the system 100 may move to block 670 where the service provider 130 may collect additional data on the behavior of the revenue generator A 110A. Upon collecting additional data on the revenue generator A 110A, the system 100 may return to block 615 and reclassify the data associated with the revenue generator A 110A.

If the composite score of the revenue generator A 110A drops below the monitor review threshold or if the server 310 modified one of the statuses of the revenue generator A 110A, then the system 100 may move to block 635 where the server 310 may notify the monitor associated with the revenue generator A 110A, such as the monitor A 120A, that the revenue generator A 110A may require monitor review. The monitor A 120A may log into the system 100 at block 640. At block 645 the monitor A 120A may review the data and scores associated with the revenue generator A 110A. The server 310 may also notify the monitor A 120A to review a revenue generator based on random spot checks. For example, the server 310 may select revenue generators 110A-N for monitor review at random intervals.
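For illustration only, the decision made at block 630 might be sketched as below. The function name `next_block` and its parameters are hypothetical, and the random spot checks are omitted; the returned integers simply echo the block numbers of FIG. 6.

```python
def next_block(composite_score: float,
               review_threshold: float,
               status_auto_modified: bool) -> int:
    """Return the FIG. 6 block the flow may move to after block 630."""
    # A score below the monitor review threshold, or a status that the
    # server changed automatically, routes the revenue generator to
    # monitor notification (block 635).
    if composite_score < review_threshold or status_auto_modified:
        return 635
    # Otherwise the service provider continues collecting data (block 670).
    return 670
```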

At block 650 the monitor A 120A may determine whether the revenue generator A 110A was improperly scored. If the monitor A 120A determines that the revenue generator A 110A was properly scored, the system 100 may move to block 655 where the monitor A 120A may perform an action on the account of the revenue generator A 110A, or may modify the security status value of the revenue generator A 110A. The system 100 may then move to block 670 where the service provider 130 may collect additional data on the behavior of the revenue generator A 110A. Upon collecting additional data on the revenue generator A 110A, the system 100 may return to block 615 and reclassify the data associated with the revenue generator A 110A.

If, at block 650, the monitor A 120A determines that the revenue generator A 110A was improperly scored, then the system 100 may move to block 660. At block 660, the monitor A 120A may notify the server 310 of the improperly scored revenue generator A 110A. At block 662, the server 310 may correctly classify the data associated with the revenue generator A 110A and may add the data associated with the revenue generator A 110A to the historical data source 320. At block 665 the classifier model generator 330 may input the new processed historical data to the learning algorithm to generate a new classifier model 340. After generating a new classifier model 340, the system 100 may move to block 615 where the classifier model 340 may re-classify the improperly scored data associated with the revenue generator A 110A.
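The correction loop of blocks 660 through 665 might be sketched as follows, purely as an illustration. A toy one-feature threshold learner stands in for the machine learning algorithm, which the description leaves open; the names `train`, `classify`, and `correct_and_retrain`, and the single numeric feature, are assumptions made for this sketch.

```python
def train(historical):
    """Toy learner: the model is the midpoint between the class means.

    `historical` is a list of (feature_value, is_fraudulent) pairs and
    must contain at least one example of each class.
    """
    fraud = [x for x, label in historical if label]
    legit = [x for x, label in historical if not label]
    return (sum(fraud) / len(fraud) + sum(legit) / len(legit)) / 2


def classify(model, feature_value):
    """Flag a revenue generator whose feature exceeds the model threshold."""
    return feature_value > model


def correct_and_retrain(historical, feature_value, correct_label):
    """Add the monitor-corrected example and regenerate the model."""
    updated = historical + [(feature_value, correct_label)]
    return updated, train(updated)
```

For example, with historical data `[(1.0, False), (3.0, False), (9.0, True), (11.0, True)]` the toy model is 6.0, so a feature value of 5.5 is not flagged. If a monitor reports that example as fraudulent, retraining moves the model to 5.25 and the same value is then flagged. A corrected example may still be misclassified after retraining, which corresponds to the case in which the inputs or the algorithm may need to be adjusted.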

FIG. 7 illustrates steps that may be taken by a monitor A 120A in a system 100 for scoring revenue generators 110A-N. At block 710 the monitor A 120A may receive a notification from the server 310 that a revenue generator A 110A may need to be reviewed. The server 310 may send this notification if one of the statuses of the revenue generator A 110A has been modified or if the composite score of the revenue generator A 110A drops below a monitor review threshold. At block 720 the monitor A 120A may log into the system 100. At block 730 the monitor A 120A may review the composite score, any additional scores of relevance and any other data that may be associated with the revenue generator A 110A.

At block 735 the monitor A 120A may determine whether the revenue generator A 110A was improperly scored. If the monitor A 120A determines that the revenue generator A 110A was improperly scored, the system 100 may move to block 740. At block 740 the monitor A 120A may notify the server 310 that the revenue generator A 110A was improperly scored. The server 310 may then take the aforementioned steps associated with an improperly scored revenue generator. At block 750 the monitor A 120A may modify the status of the revenue generator A 110A to reflect the composite score and other scores associated with the revenue generator A 110A. If the monitor A 120A determines that the revenue generator A 110A was properly scored, the system 100 may move directly to block 750 where the monitor A 120A may modify the status of the revenue generator A 110A.

FIG. 8 is a screenshot of an implementation of a monitor interface in the systems of FIG. 1, FIG. 2, and FIG. 3 or any other system for scoring revenue generators. The screenshot 800 may include a view select table 810, a view select submit button 820, a results table 830 and a results submit button 840.

The view select table 810 may display various options the monitor A 120A may select to display the data associated with a revenue generator A 110A. The time view selection may allow the monitor A 120A to select the length of the periods of time to display the data over in the results table 830. The monitor A 120A may also be able to select a range of time to display results over. The algorithm view option may give a monitor A 120A the option to view the results as classified through the current classifier model 340 or to see the results through any other classifier models that have been stored. The view only online accounts option may give the monitor A 120A the option to only view online accounts. Once the monitor A 120A has selected their desired view options in the view select table 810, the monitor A 120A may click the view select submit button 820 to submit the selections to the server 310.

After the monitor A 120A has clicked the view select submit button 820, the results may be displayed to the monitor A 120A in the results table 830. The results table 830 may display information relating to a specific account of the monitor A 120A, such as client name, the name of the revenue generator A 110A, the account ID, the composite score returned from the scoring metric 350, the age of the account, representing the number of days the revenue generator A 110A has participated in the system 100, the overall status, representing whether the accounts of the revenue generator A 110A are online or offline, a security status option giving the monitor A 120A a method of modifying the status of the revenue generator A 110A, an account search term risk score, a search term max-spend score, a search term risk-spend score, a search term risk-runRate score, representing the search term risk-run rate score, a max spend over daily budget score, a daily run rate over daily budget score, a max card AFS score, and a “View Details” link. The system 100 may be configurable to display any of the data associated with the revenue generators 110A-N to the monitors 120A-N.

The security status dropdown box may contain options relating to the status of the revenue generators 110A-N, such as “online fraudulent,” “online verified,” “offline unverified,” “online high risk,” and “offline high risk.” The monitors 120A-N may modify the status of any displayed revenue generators 110A-N by modifying the security status dropdown box and clicking the results submit button 840. The monitor A 120A may obtain detailed information about any of the listed revenue generators 110A-N by clicking on the “View Details” link at the end of the row corresponding to a given revenue generator, such as the revenue generator A 110A.

FIG. 9 is a screenshot of an implementation of a detailed revenue generator account view of a monitor interface in the systems of FIG. 1, FIG. 2, and FIG. 3 or any other system for scoring revenue generators. The screenshot 900 may include a view select table 910, a view select submit button 920, a security status select table 930, a security status submit button 940, a details display table 950, a graph 960, a composite score line 970, and a review threshold line 980.

The view select table 910 may provide the monitor A 120A with options to change the manner in which the data related to the referenced account of the revenue generator A 110A is displayed. The monitor A 120A may be able to select a time view, a time period, and a score history view which may include the option of viewing data relating to any of the aforementioned features of the revenue generator A 110A. The monitor A 120A may modify the view options and may click on the view select submit button 920 to submit the request for a different view of the data to the server 310.

The details display table 950 may display details regarding the revenue generator A 110A and the account of the revenue generator A 110A referenced by the screenshot 900, such as the type of payment plan for the account, the number of payments per month made by the revenue generator A 110A for the account, the number of adjustments per month to the account, the daily budget of the account, the account age, the client tier value, represented as the client value, the daily run rate of the account, the last time a click feed was received for the account, the real time since the last received click feed, which may be displayed in any time metric such as seconds, minutes, hours, or days, the definitive balance of the account and the composite score of the account.

The graph 960 may display the data relating to the options selected by the monitor A 120A in the view select table 910 for the specified period of time. The composite score line 970 may provide information regarding the composite score of the revenue generator A 110A over the time period. The review threshold line 980 may provide information regarding the review threshold for the revenue generators 110A-N. The lines 970, 980 may individually or jointly provide a monitor A 120A with information regarding whether the revenue generator A 110A is a fraudulent revenue generator.

If the monitor A 120A determines that the security status of the revenue generator A 110A needs to be modified, the security status select table 930 may provide the monitor A 120A with a method to modify the security status of the revenue generator A 110A. A monitor A 120A may modify the security status value in the security status select table 930 and then click on the security status submit button 940 to submit the modification to the server 310.

FIG. 10 is a screenshot of the output of a clustering algorithm run on a historical dataset. The clustering algorithm may be a K-means algorithm or some other clustering algorithm. The circled rows may represent centroids indicating a behavior type of several of the revenue generators 110A-N. The data clustering may assist in the feature selection process. If a value of a feature is common to all revenue generators 110A-N of a specific classification, such as a fraudulent revenue generator classification, then the feature may be useful in describing the behavior of the revenue generator.
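As a minimal sketch of how a K-means clustering of this kind may operate, the following applies Lloyd's iteration to rows of numeric features. The function name `kmeans`, the caller-supplied initial centroids, and the fixed iteration count are assumptions made for illustration; the description does not state how the clustering of FIG. 10 was configured.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm over tuples of floats.

    Each iteration assigns every point to its nearest centroid (squared
    Euclidean distance) and then moves each centroid to the mean of the
    points assigned to it. An empty cluster keeps its previous centroid.
    """
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            distances = [
                sum((a - b) ** 2 for a, b in zip(point, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(point)
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters
```

Each returned centroid summarizes one behavior type. A feature whose value is nearly constant within a cluster dominated by one classification may be a candidate for the feature vector.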

FIG. 11 is a screenshot of a daily absolute advertiser search term risk graph 1100 containing daily absolute advertiser search term risk data of one or more revenue generators 110A-N that may be displayed to a monitor A 120A in place of, or in addition to, the graph 960 of FIG. 9. The y-axis of the daily absolute search term risk graph 1100 may represent a client search term risk score and the x-axis of the daily absolute search term risk graph 1100 may represent a time value, such as a day. The daily absolute search term risk graph 1100 may contain two lines representing data relating to the daily absolute client search term risk score: a 5-day EMA line 1120 representing a 5-day exponential moving average, and an actual search term risk line 1130.

The 5-day EMA line 1120 may provide the monitor A 120A with information on the 5-day exponential moving average value of the client search term risk of the revenue generator A 110A over the time period. The actual search term risk line 1130 may provide the monitor A 120A with information on the client search term risk score of the revenue generator A 110A over the time period. The lines 1120 and 1130 may individually or jointly provide the monitor A 120A with information regarding whether the revenue generator A 110A is a fraudulent revenue generator.
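For illustration only, an exponential moving average such as the one plotted by the 5-day EMA line 1120 might be computed as sketched below. The smoothing factor 2 / (span + 1) and the seeding of the average with the first observation are common conventions assumed here; the description does not specify them. The function name `ema` is hypothetical.

```python
def ema(values, span):
    """Return the exponential moving average of `values`.

    Uses smoothing factor alpha = 2 / (span + 1) and seeds the series
    with the first observation.
    """
    alpha = 2 / (span + 1)
    averaged = []
    for value in values:
        if not averaged:
            averaged.append(value)
        else:
            averaged.append(alpha * value + (1 - alpha) * averaged[-1])
    return averaged
```

Calling `ema(daily_scores, 5)` would yield one smoothed value per day. Because the average lags the raw score, a sharp divergence between the raw score and its average may indicate an abrupt change in behavior.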

FIG. 12 is a screenshot of a daily relative advertiser search term risk graph 1200 containing daily relative advertiser search term risk data of one or more revenue generators 110A-N that may be displayed to a monitor A 120A in place of, or in addition to, the graph 960 of FIG. 9. The y-axis of the daily relative advertiser search term risk graph 1200 may represent a relative client search term risk score and the x-axis of the daily relative advertiser search term risk graph 1200 may represent a time value, such as a day. The daily relative advertiser search term risk graph 1200 may contain three lines representing data relating to the daily relative client search term risk score: a relative search term risk line 1210, an average of all advertisers line 1220, and a median line 1230.

The relative search term risk line 1210 may provide the monitor A 120A with information on the client search term risk score of the revenue generator A 110A over the time period. The average of all advertisers line 1220 may provide the monitor A 120A with information on the average client search term risk score of all of the revenue generators 110A-N over the time period. The median line 1230 may provide the monitor A 120A with information on the median of the client search term risk score for the revenue generators 110A-N over the time period. The lines 1210, 1220, and 1230 may individually or together provide the monitor A 120A with information regarding whether the revenue generator A 110A is a fraudulent revenue generator.
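The two population-wide reference lines might be derived as in the following sketch, which computes a per-day average and median across all revenue generators. The function name `daily_reference_lines` and the input layout, a mapping from a revenue generator identifier to a list of daily scores of equal length, are assumptions made for illustration.

```python
from statistics import mean, median


def daily_reference_lines(scores_by_generator):
    """Return (average_line, median_line), one value per day.

    `scores_by_generator` maps an identifier to that revenue generator's
    daily search term risk scores; all lists must have the same length.
    """
    # Transpose so that each tuple holds every generator's score for one day.
    per_day = list(zip(*scores_by_generator.values()))
    average_line = [mean(day) for day in per_day]
    median_line = [median(day) for day in per_day]
    return average_line, median_line
```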

FIG. 13 is a screenshot of a daily spend amount for an advertiser graph 1300 containing daily spend amount data of one or more revenue generators 110A-N that may be displayed to a monitor A 120A in place of, or in addition to, the graph 960 of FIG. 9. The y-axis of the daily spend amount for an advertiser graph 1300 may represent an amount, such as a dollar amount, and the x-axis of the daily spend amount for an advertiser graph 1300 may represent a time value, such as a day. The daily spend amount for an advertiser graph 1300 may contain three lines representing data relating to the daily spend amount for a revenue generator A 110A: an actual spend line 1310, a 5-day EMA line 1320 representing a 5-day exponential moving average, and a 10-day EMA line 1330 representing a 10-day exponential moving average. Alternatively or in addition, the advertiser graph may show an EMA for any period of time, which may be specified by a monitor A 120A.

The actual spend line 1310 may provide the monitor A 120A with information on the client spend score of the revenue generator A 110A over the time period, representing the total amount spent by the revenue generator A 110A over the time period. The 5-day EMA line 1320 may provide the monitor A 120A with information on the 5-day exponential moving average value of the client spend score of the revenue generator A 110A. The 10-day EMA line 1330 may provide the monitor A 120A with information on the 10-day exponential moving average value of the client spend score of the revenue generator A 110A. The lines 1310, 1320, and 1330 may individually or jointly provide the monitor A 120A with information regarding whether the revenue generator A 110A is a fraudulent revenue generator.

FIG. 14 is a screenshot of an hourly spend amount for an advertiser graph 1400 containing hourly spend amount data of one or more revenue generators 110A-N that may be displayed to a monitor A 120A in place of, or in addition to, the graph 960 of FIG. 9. The y-axis of the hourly spend amount for an advertiser graph 1400 may represent an amount, such as a dollar amount, and the x-axis of the hourly spend amount for an advertiser graph 1400 may represent a time value, such as an hour. The hourly spend amount for an advertiser graph 1400 may contain three lines representing data relating to the hourly spend amount for a revenue generator A 110A: an actual spend line 1410, a 24-hour EMA line 1420 representing a 24-hour exponential moving average, and a 5-hour EMA line 1430 representing a 5-hour exponential moving average.

The actual spend line 1410 may provide the monitor A 120A with information on the client spend score of the revenue generator A 110A over the time period, representing the total amount spent by the revenue generator A 110A over the time period. The 24-hour EMA line 1420 may provide the monitor A 120A with information on the 24-hour exponential moving average value of the client spend score of the revenue generator A 110A. The 5-hour EMA line 1430 may provide the monitor A 120A with information on the 5-hour exponential moving average value of the client spend score of the revenue generator A 110A. The lines 1410, 1420, and 1430 may individually or jointly provide the monitor A 120A with information regarding whether the revenue generator A 110A is a fraudulent revenue generator.

FIG. 15 is a screenshot of a monthly spend amount for an advertiser graph 1500 containing monthly spend amount data of one or more revenue generators 110A-N that may be displayed to a monitor A 120A in place of, or in addition to, the graph 960 of FIG. 9. The y-axis of the monthly spend amount for an advertiser graph 1500 may represent an amount, such as a dollar amount, and the x-axis of the monthly spend amount for an advertiser graph 1500 may represent a time value, such as a month. The monthly spend amount for an advertiser graph 1500 may contain three lines representing data relating to the monthly spend amount for a revenue generator A 110A: an actual spend line 1510, a 3-month EMA line 1520 representing a 3-month exponential moving average, and a 6-month EMA line 1530 representing a 6-month exponential moving average.

The actual spend line 1510 may provide the monitor A 120A with information on the client spend score of the revenue generator A 110A over the time period, representing the total amount spent by the revenue generator A 110A over the time period. The 3-month EMA line 1520 may provide the monitor A 120A with information on the 3-month exponential moving average value of the client spend score of the revenue generator A 110A. The 6-month EMA line 1530 may provide the monitor A 120A with information on the 6-month exponential moving average value of the client spend score of the revenue generator A 110A. The lines 1510, 1520, and 1530 may individually or jointly provide the monitor A 120A with information regarding whether the revenue generator A 110A is a fraudulent revenue generator.

FIG. 16 is a component diagram 1600 of an implementation of a system for scoring revenue generators 110A-N. The component diagram 1600 may include a scoring system 1610 and a listing service 1620. The scoring system 1610 may include a data scrubber 1630, a revenue generator score component 1640, and a trainer 1650. The listing service 1620 may provide a user interface or an API that allows a monitor A 120A to see the list of revenue generators 110A-N, particularly the high risk revenue generators 110A-N or the revenue generators 110A-N likely to commit fraud. A third party may use the API to access the scoring system 1610, such as by accessing the scoring system as a monitor. This may allow for additional uses of a system for scoring revenue generators 100 in the future.

The data scrubber 1630 may process the revenue generator data or the historical revenue generator data in order to generate the features or the feature vectors. The data scrubber 1630 may perform aggregation and binning when necessary and may store the processed revenue generator data in temporary tables for processing. The revenue generator score component 1640 may score the revenue generators 110A-N. The revenue generator score component 1640 may take the output of the data scrubber 1630 as input. The revenue generator score component 1640 may rely on the trainer 1650 to train the machine learning algorithm in order to generate a classifier model 340. The trainer 1650 may train the machine learning algorithm on what characteristics represent a high risk revenue generator and what characteristics represent a low risk revenue generator. The trained machine learning algorithm may be the classifier model 340. The training set used may be rows of data from the data scrubber 1630. Each row of data may represent a feature vector. The training data may have been manually identified as high risk revenue generators and low risk revenue generators. The revenue generator score component 1640 may then use the classifier model 340 to score revenue generators 110A-N.
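As a hedged illustration of the aggregation and binning that a data scrubber may perform, the sketch below reduces a series of daily spend amounts to a two-element feature vector. The function names, the bin edges, and the choice of features (a binned total spend and a maximum spend over daily budget ratio) are assumptions made for this example only.

```python
from bisect import bisect_right


def bin_index(value, edges):
    """Return the index of the bin containing `value`.

    `edges` is a sorted list of boundaries; bin 0 lies below the first
    edge, and each edge is the inclusive lower bound of the next bin.
    """
    return bisect_right(edges, value)


def feature_vector(daily_spend, daily_budget, spend_edges):
    """Aggregate raw daily spend into one row of features."""
    total_spend = sum(daily_spend)                         # aggregation
    max_over_budget = max(daily_spend) / daily_budget      # ratio feature
    return [bin_index(total_spend, spend_edges), max_over_budget]
```

Each row produced this way could serve as one feature vector in the training set, paired with a manually assigned high risk or low risk label.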

FIG. 17 is a class diagram 1700 of an implementation of a system for scoring revenue generators. The class diagram 1700 may contain a trainer 1710, a classifier interface 1720, a classifier model interface 1730, a data cache interface 1740, a data loader interface 1750, an instance interface 1760, a classifier factory 1770, a feature 1780, a stats utility 1790, and a revenue generator scoring component 1795.

The trainer 1710 may be associated with the classifier interface 1720 and may represent a component of the system 100 that utilizes a machine learning algorithm to generate the classifier model 340. The classifier interface 1720 may be associated with the trainer 1710, the classifier factory 1770, the classifier model interface 1730, and the revenue generator scoring component 1795. The classifier interface 1720 may have a train method, which takes an instance of the data loader interface 1750 as an input, a classifyInstance method, a persistModel method and a loadModel method.

The train method may utilize the trainer 1710 to generate a classifier model 340. The classifyInstance method may return an instance of the classifier interface 1720, the persistModel method may store the generated classifier model 340, and the loadModel method may load the classifier model 340. The classifier interface 1720 may contain the basic methods and actions associated with the classifier model 340.
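One way such an interface might look in code is sketched below as a Python abstract base class with a toy implementation. Several points are assumptions rather than part of the description: the snake_case method names, the use of any iterable of (features, label) pairs in place of the data loader interface 1750, a boolean return value from `classify_instance`, JSON as the persistence format, and the `ThresholdClassifier` itself, which is only a stand-in for a real machine learning algorithm.

```python
import json
from abc import ABC, abstractmethod


class Classifier(ABC):
    """Hypothetical rendering of the classifier interface."""

    @abstractmethod
    def train(self, data_loader): ...

    @abstractmethod
    def classify_instance(self, instance): ...

    @abstractmethod
    def persist_model(self, path): ...

    @abstractmethod
    def load_model(self, path): ...


class ThresholdClassifier(Classifier):
    """Toy model: flags an instance whose first feature exceeds the
    midpoint between the high risk and low risk training means."""

    def __init__(self):
        self.threshold = None

    def train(self, data_loader):
        # Accumulate [sum, count] of the first feature for each label.
        totals = {True: [0.0, 0], False: [0.0, 0]}
        for features, is_high_risk in data_loader:
            totals[is_high_risk][0] += features[0]
            totals[is_high_risk][1] += 1
        high_mean = totals[True][0] / totals[True][1]
        low_mean = totals[False][0] / totals[False][1]
        self.threshold = (high_mean + low_mean) / 2

    def classify_instance(self, instance):
        return instance[0] > self.threshold

    def persist_model(self, path):
        with open(path, "w") as handle:
            json.dump({"threshold": self.threshold}, handle)

    def load_model(self, path):
        with open(path) as handle:
            self.threshold = json.load(handle)["threshold"]
```

Keeping persistence behind the interface lets a stored model be reloaded for scoring without retraining, which matches the separation between training and scoring described for the system.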

The classifier model interface 1730 may be associated with the classifier interface 1720. The classifier model interface 1730 may have a toXML method and a toBinArray method. The toXML method may convert the data representing the classifier model 340 into XML format. The toBinArray method may convert the data representing the classifier model into a binary array format. The classifier model interface 1730 may implement additional methods and actions that may be used by the classifier model 340.

The classifier factory 1770 may be associated with the classifier interface 1720. The classifier factory 1770 may implement a getInstance method, a buildClassifier method and a setOptions method. The getInstance method may return an instance of the classifier model 340. The buildClassifier method may generate the classifier model 340. The setOptions method may set options related to the generation of the classifier model 340.

The data cache interface 1740 may be associated with the data loader interface 1750. The data cache interface 1740 may store data associated with the revenue generators 110A-N, the monitors 120A-N, and the classifier model 340.

The data loader interface 1750 may be associated with the data cache interface 1740, the classifier interface 1720, and the instance interface 1760. The data loader interface 1750 may have a hasMore method and a next method. The hasMore method may determine if there is any additional data. The next method may output an instance 1760. The data loader interface 1750 may load the data needed by the system 100.

The instance interface 1760 may be associated with the feature 1780 and the data loader interface 1750. The instance interface 1760 may include a getFeature method that takes an index as an input and outputs a feature 1780, and a setFeature method that takes an index and a feature as inputs. The getFeature method may return a feature 1780 identified by the index. The setFeature method may set the value of the feature 1780. The feature 1780 may be associated with the instance interface 1760. The feature 1780 may represent a feature of the revenue generator data or the feature vector of the revenue generator data.
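A minimal sketch of how a data loader and an instance of this kind might cooperate is given below. The class names `Instance` and `ListDataLoader`, the snake_case method names, and the in-memory list backing the loader are hypothetical; an actual implementation may read from the data cache instead.

```python
class Instance:
    """Holds one feature vector; features are addressed by index."""

    def __init__(self, features):
        self._features = list(features)

    def get_feature(self, index):
        return self._features[index]

    def set_feature(self, index, feature):
        self._features[index] = feature


class ListDataLoader:
    """Serves instances one at a time from an in-memory list of rows."""

    def __init__(self, rows):
        self._instances = [Instance(row) for row in rows]
        self._position = 0

    def has_more(self):
        return self._position < len(self._instances)

    def next(self):
        instance = self._instances[self._position]
        self._position += 1
        return instance
```

A consumer such as a train method could then loop with `while loader.has_more():` and read features from each instance returned by `loader.next()`.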

The stats utility 1790 may be a standalone component. The stats utility 1790 may implement a mean method, a mode method and a variance method. The mean method may compute the mean of a feature 1780. The mode method may compute the mode of a feature 1780. The variance method may compute the variance of a feature 1780. The stats utility 1790 may be used by one of the monitors 120A-N, or some other user, to review data and scores associated with revenue generators 110A-N.
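For illustration, the three statistics might be provided by thin wrappers around Python's standard `statistics` module, as sketched below. The class name `StatsUtil` is hypothetical, and the use of population variance is an assumption; the description does not say whether a population or sample variance is intended.

```python
import statistics


class StatsUtil:
    """Summary statistics over the observed values of one feature."""

    @staticmethod
    def mean(values):
        return statistics.mean(values)

    @staticmethod
    def mode(values):
        return statistics.mode(values)

    @staticmethod
    def variance(values):
        # Population variance is assumed here; statistics.variance
        # would give the sample variance instead.
        return statistics.pvariance(values)
```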

The revenue generator scoring component 1795 may be associated with the classifier interface 1720. The revenue generator scoring component 1795 may be responsible for scoring the revenue generators 110A-N. The revenue generator scoring component 1795 may be associated with an instance of the classifier interface 1720. The revenue generator scoring component 1795 may implement a scoreRevenueGenerator method. The scoreRevenueGenerator method may take an identification variable associated with one of the revenue generators 110A-N, such as a variable associated with the revenue generator A 110A, and may output a score of the revenue generator A 110A.

FIG. 18 is a use case 1800 of an implementation of a system for scoring revenue generators. The use case 1800 may include a data scrubber 1802, a trainer 1818, a monitor 1832, a revenue generator score 1826 and an admin 1814.

The data scrubber 1802 may create the revenue generator data model at step 1804. This step may include processing the historical revenue generator data to develop the features or feature vector used to create the classifier model 340. This step may also include processing the data from one of the revenue generators 110A-N once it is collected by the service provider 130.

The trainer 1818 may create a training set at step 1820. The training set may be the set of processed historical data that may be inputted to the machine learning algorithm to create the classifier model 340. The historical data may be processed by the data scrubber 1802 at step 1804 and then may be compiled by the trainer 1818 at 1820. The trainer 1818 may then use the training set, including the processed historical data, to train the classifier model 340 at 1822. The trainer 1818 may then store the classifier model 340 at step 1824, in the form of the revenue generator score 1826.

The revenue generator score 1826 may load the classifier model at step 1830 and may score one of the revenue generators 110A-N at step 1828. This step may include applying the classifier model 340 to the data collected by the service provider 130 and processed by the data scrubber 1802.

The monitor 1832 may be one of the monitors 120A-N. The monitor 1832 may view individual activities of the revenue generators 110A-N at 1834. The monitor 1832 may view the bidding behavior of one of the revenue generators 110A-N, such as the revenue generator A 110A, at 1838. The monitor 1832 may view the transaction history of the revenue generator A 110A at 1840. The monitor 1832 may view the search terms bid on by the revenue generator A 110A at 1842. The monitor 1832 may list all of the revenue generator scores at 1836. The monitor 1832 may change the status, or security status, of the revenue generator A 110A at step 1844. The monitor 1832 may change a status associated with a URL, such as by marking a specific URL as being associated with a fraudulent revenue generator or banning a URL from ever being used again, at step 1850. The monitor 1832 may modify a status associated with a search term, such as by marking a search term as bid on by a fraudulent revenue generator or increasing a search term risk score associated with a search term when the search term is used fraudulently, at step 1848. The monitor 1832 may modify the status of an account or all accounts of one of the revenue generators 110A-N at step 1846.

The admin 1814 may control administrative functions of the system 100. The admin 1814 may be one of the monitors 120A-N, or may be some other person. The admin 1814 may change the configuration of the system 100 at step 1812. At 1810 the admin 1814 may change the configuration of the classifier model 340, or any processes associated with generating the classifier model 340, such as modifying the machine learning algorithm used to generate the classifier model 340. At 1816, the admin 1814 may change parameters of the system 100 as a whole.

FIG. 19 is a graphic demonstrating the machine learning process that may be used in a system for scoring revenue generators 100. FIG. 19 may demonstrate the cycle of the steps involved in the machine learning process. The system 100 may collect historical data on revenue generators 110A-N. The system 100 may train a machine learning algorithm with the historical data, which may result in the generation of a classifier model 340. The system 100 may score new revenue generators 110A-N with the classifier model 340. The system 100 may suggest modifications to the status of the revenue generators 110A-N. One of the monitors 120A-N, such as the monitor A 120A, may review the suggestions of the machine learning algorithm or classifier model 340, and may determine whether the revenue generators 110A-N were properly scored. If the monitor A 120A determines that one of the revenue generators 110A-N was improperly scored, the monitor A 120A may take the steps associated with properly classifying the revenue generator data. The reclassified data may then be collected and used to train the machine learning algorithm, which may generate a more accurate classifier model 340.

The revenue generators 110A-N may represent any entities that may generate revenue for the service provider 130, such as advertisers, web content publishers or other partners, auction participants, or generally any entity that may generate revenue for the service provider 130 and may interact with the service provider 130 in a fraudulent manner. The monitors 120A-N may be human users or may be automated machine users. Any machine learning algorithm may function within the bounds of the system 100 if the machine learning algorithm is capable of classifying a revenue generator A 110A as fraudulent or as not fraudulent.

The system 100 may also be adapted to identify service provider partners, such as publishers, who may be profitable and those who may not be profitable. Publishers may be service provider partners who may serve advertisements of advertisers, supplied to the publishers by the service provider 130, to the users 150. When the users 150 view or click on an advertisement of one of the advertisers, the advertisers may pay the service provider 130. The service provider 130 may then pay the publisher. Thus the service provider partners may also be revenue generators 110A-N.

The system 100 may assist the service provider 130 in identifying which revenue generators 110A-N, such as service provider partners, may be profitable and which may not be profitable. The system 100 may also identify which service provider partners may be profitable for serving a particular advertisement, or a group or category of advertisements, and which service provider partners may not be profitable for serving a particular advertisement or group or category of advertisements. Furthermore, the system 100 may identify which pages on a service provider partner may be more profitable or less profitable for serving a particular advertisement or group or category of advertisements. In this case the system 100 may use some or all of the features identified above, along with one or more additional features relating to a profitable service provider partner or unprofitable service provider partner, to generate the classifier model 340.
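As an illustration only, profitability per service provider partner and advertisement category might be derived from the payment flow described above. The description does not define a profitability measure, so the net margin used here (advertiser payment minus partner payout) is an assumption, as are the event tuple layout, the function names, and the zero threshold. The resulting labels could serve as one of the additional profitability features or as training labels.

```python
from collections import defaultdict

def net_margin_by_partner_and_category(events):
    """Sum (advertiser payment - partner payout) per (partner, category).

    `events` is a hypothetical iterable of tuples:
    (partner_id, ad_category, advertiser_payment, partner_payout).
    """
    totals = defaultdict(float)
    for partner, category, paid_in, paid_out in events:
        totals[(partner, category)] += paid_in - paid_out
    return dict(totals)

def label_profitable(margins, threshold=0.0):
    """Label each (partner, category) pair; the threshold is an assumed cutoff."""
    return {key: margin > threshold for key, margin in margins.items()}
```

The same aggregation could be keyed on a page identifier instead of a category to compare pages within one partner.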

The service provider 130 may take actions based on the information provided by the system 100, such as to end the partnership with a service provider partner, serve fewer advertisements to a service provider partner, serve more advertisements to a service provider partner, or serve specific advertisements, groups of advertisements or categories of advertisements to a service provider partner. Advertisements may be grouped or categorized based on several factors, including demographics, geographic location, industry sectors, or any grouping of advertisements that may be identified as more profitable or less profitable when served by a given service provider partner.
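A minimal sketch of how a score might be mapped to one of the actions above follows. The multiplier of 1000 mirrors the scoring metric recited in the claims; the function name and the `low` and `high` cutoffs are hypothetical values chosen for illustration, not values taken from the description.

```python
def suggest_action(raw_score, multiplier=1000, low=200, high=800):
    """Map a classifier score in [0, 1] to a suggested partner action.

    The score is treated as the likelihood that the partner is profitable.
    `low` and `high` are assumed cutoffs on the scaled score.
    """
    scaled = raw_score * multiplier
    if scaled < low:
        return "end partnership"
    if scaled < high:
        return "serve fewer advertisements"
    return "serve more advertisements"
```

For example, `suggest_action(0.9)` falls above the assumed `high` cutoff and suggests serving more advertisements, while a monitor would remain free to override any suggestion.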

In the case of demographics, the advertisements may be grouped based on the demographics of users 150 who historically click on the advertisements most often. For example, the system 100 may have a grouping of the top fifty advertisements clicked on most often by males ages 18-39. The system 100 may group sites together based on any demographics of users 150 that may be identified as more profitable or less profitable when served by a given service provider partner.
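The demographic grouping in the example above can be sketched as a filtered count over a click log. The log format, a sequence of `(ad_id, gender, age)` tuples, and the function name are assumptions made for illustration.

```python
from collections import Counter

def top_ads_for_segment(clicks, gender, min_age, max_age, n=50):
    """Return the n advertisements clicked most often by one demographic segment.

    `clicks` is a hypothetical iterable of (ad_id, gender, age) tuples.
    """
    counts = Counter(
        ad_id
        for ad_id, user_gender, user_age in clicks
        if user_gender == gender and min_age <= user_age <= max_age
    )
    return [ad_id for ad_id, _ in counts.most_common(n)]
```

Calling `top_ads_for_segment(clicks, "M", 18, 39)` would yield a grouping like the top fifty advertisements for males ages 18-39 described above.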

In the case of advertisements grouped together based on a geographic area, the system 100 may group the advertisements that are relevant to a geographic area. The advertisements in the group may refer to advertisers who may physically be located within the geographic area or the advertisements may be relevant to the geographic area based on some other factor.

In the case of advertisements grouped together based on industry sectors, the advertisements included in an industry sector grouping may include the advertisements of any entities involved in the industry sector, the advertisements of trade journals or publications relating to the industry sector, the advertisements of professional organizations related to the industry sector, or any other associated advertisements relating to the industry sector. Advertisements may be grouped under any other category which may be more profitable or less profitable when served by a particular service provider partner.

The illustrations described herein are intended to provide a general understanding of the structure of various embodiments. The illustrations are not intended to serve as a complete description of all of the elements and features of apparatus and processors that utilize the structures or methods described herein. Many other embodiments may be apparent to those of skill in the art upon reviewing the disclosure. Other embodiments may be utilized and derived from the disclosure, such that structural and logical substitutions and changes may be made without departing from the scope of the disclosure. Additionally, the illustrations are merely representational and may not be drawn to scale. Certain proportions within the illustrations may be exaggerated, while other proportions may be minimized. Accordingly, the disclosure and the figures are to be regarded as illustrative rather than restrictive.

One or more embodiments of the disclosure may be referred to herein, individually and/or collectively, by the term “invention” merely for convenience and without intending to voluntarily limit the scope of this application to any particular invention or inventive concept. Moreover, although specific embodiments have been illustrated and described herein, it should be appreciated that any subsequent arrangement designed to achieve the same or similar purpose may be substituted for the specific embodiments shown. This disclosure is intended to cover any and all subsequent adaptations or variations of various embodiments. Combinations of the above embodiments, and other embodiments not specifically described herein, may be apparent to those of skill in the art upon reviewing the description.

The Abstract is provided with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing Detailed Description, various features may be grouped together or described in a single embodiment for the purpose of streamlining the disclosure. This disclosure is not to be interpreted as reflecting an intention that the claimed embodiments require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter may be directed to less than all of the features of any of the disclosed embodiments. Thus, the following claims are incorporated into the Detailed Description, with each claim standing on its own as defining separately claimed subject matter.

The above disclosed subject matter is to be considered illustrative, and not restrictive, and the appended claims are intended to cover all such modifications, enhancements, and other embodiments, which fall within the true spirit and scope of the present invention. Thus, to the maximum extent allowed by law, the scope of the present invention is to be determined by the broadest permissible interpretation of the following claims and their equivalents, and shall not be restricted or limited by the foregoing detailed description.

1. A method for monitoring a service provider partner, comprising: identifying a historical dataset corresponding to a historical behavior of a set of service provider partners; processing the historical dataset to identify a feature vector wherein the feature vector comprises a set of variables related to detecting a profitable service provider partner; generating a classifier model from the historical dataset and the feature vector; collecting current service provider partner data representing a current service provider partner; processing the current service provider partner data to generate a current service provider partner data feature vector; generating a score by applying the classifier model to the current service provider partner data feature vector, wherein the score represents a likelihood of the current service provider partner generating large revenues; identifying a monitor; and notifying the monitor of the score of the current service provider partner.
2. The method of claim 1 wherein generating the classifier model further comprises inputting the historical dataset and the feature vector to a machine learning algorithm to generate the classifier model.
3. The method of claim 2 wherein the machine learning algorithm comprises a decision tree.
4. The method of claim 1 further comprising modifying the current service provider partner data based on the score of the current service provider partner.
5. The method of claim 4 wherein the current service provider partner data comprises a service provider partner status.
6. The method of claim 1 wherein generating the score further comprises: generating a scoring metric; and applying the scoring metric to the score generated by the classifier model.
7. The method of claim 6 wherein the scoring metric comprises a multiplier.
8. The method of claim 7 wherein the multiplier is 1000.
9. The method of claim 1 wherein the historical behavior of the set of service provider partners is identified as profitable behavior or not profitable behavior.
10. The method of claim 1 further comprising: modifying the current service provider partner data to include a classification value; adding the current service provider partner data to the historical service provider partner data; re-processing the historical service provider partner data to generate the feature vector; and re-generating the classifier model from the historical service provider partner data and the feature vector.
11. The method of claim 10 wherein the classification value identifies the current service provider partner as a profitable service provider partner.
12. The method of claim 1 wherein the current service provider partner comprises a web publisher.
13. A method of monitoring service provider partners, comprising: collecting a service provider partner data representing a service provider partner; processing the service provider partner data; generating a score of the service provider partner data, based on the processed service provider partner data, indicating the likelihood of the service provider partner being a profitable service provider partner; and handling the service provider partner data based on the score of the service provider partner data.
14. The method of claim 13 wherein the handling of the service provider partner further comprises: identifying a monitor; and notifying the monitor of the score of the service provider partner data.
15. The method of claim 14 wherein the notified monitor modifies the service provider partner data.
16. The method of claim 13 wherein processing the service provider partner data further comprises processing the service provider partner data to identify a feature vector wherein the feature vector comprises a set of variables related to detecting service provider partner profitability.
17. The method of claim 13 wherein the handling of the service provider partner data further comprises modifying the service provider partner data.
18. The method of claim 17 wherein the service provider partner data comprises a service provider partner status.
19. The method of claim 17 wherein the service provider partner data comprises a spend limit.
20. A system for monitoring a service provider partner, comprising: a memory to store a classifier model, a historical service provider partner dataset, a feature vector, a current service provider partner data and a current service provider partner data feature vector, wherein the feature vector comprises a set of variables related to detecting a profitable revenue generator; an interface operatively connected to the memory to collect the current service provider partner data from a current service provider partner and to interact with a monitor; a processor operatively connected to the memory and the interface, which processes the historical service provider partner dataset to identify the feature vector, generates the classifier model from the historical dataset and the feature vector, processes the current service provider partner data to generate the current service provider partner data feature vector, and generates a score signifying a likelihood of the current service provider partner generating large revenues by applying the classifier model to the current service provider partner data feature vector, identifies a monitor, and notifies the monitor of the score through the interface.
21. The system of claim 20 wherein the processor modifies the current service provider partner data based on the score.
22. The method of claim 21 wherein the current service provider partner data comprises a service provider partner status.
23. The method of claim 21 wherein the current service provider partner data comprises a service provider partner spend limit.
24. The system of claim 20 wherein the notified monitor modifies the current service provider partner data.
25. The system of claim 20 wherein the processor generates a scoring metric and applies the scoring metric to the score generated by the classifier model.
26. The system of claim 20 wherein the classifier model is generated by using a machine learning algorithm.
27. The system of claim 26 wherein the machine learning algorithm comprises a decision tree.
28. The system of claim 20 wherein the historical service provider partner data comprises data identified as relating to a profitable service provider partner and data identified as relating to a not profitable service provider partner.